Abstract


In this paper we offer a formal, rigorous proof of the correctness of Awerbuch’s algorithm for 
network synchronization [1]. We specify both the algorithm and the correctness condition using 
the I/O automaton model. Our proof of correctness follows Awerbuch’s intuitive arguments 
closely by exploiting the model’s natural support for techniques of stepwise refinement and 
modularity. We demonstrate that the model is a powerful tool for reasoning about distributed 
algorithms. 
1 Introduction


We prove the correctness of a distributed protocol for network synchronization. Most networks
do not offer reliable bounds on the time a message takes to arrive, so it is important to find 
algorithms that work correctly in an asynchronous system. It is, however, much easier to design 
algorithms if the network is synchronous. Awerbuch [1] proposed the use of a synchronizer that 
would enable one to convert any synchronous graph algorithm into an algorithm that performs 
correctly in an asynchronous (but failure-free) network. Such synchronizers have been used to give 
efficient asynchronous algorithms ([2]). We give a formal proof of correctness of this algorithm. 

In [1], a synchronizer (called γ in that paper) is constructed for a network whose topology is
any fixed, connected graph. The algorithm is described in terms of sub-algorithms that run on
trees of a suitable spanning forest subgraph, and a distributed technique is given for finding such
a subgraph for which the resulting algorithm has low time and message complexity.

The synchronizing algorithm is derived as a composition of a simple synchronizer (called β)
executing within each “cluster” (the set of nodes in each tree of the spanning forest subgraph),
and another simple synchronizer (called α) that synchronizes between the clusters. While this
description helps to explain the detailed algorithm, no formal proof of correctness is offered in [1]. 
We provide a formal account of an algorithm closely based on Awerbuch’s description, and rigor- 
ously prove results about its correctness. Our exposition follows Awerbuch’s informal arguments 
by building on claims that express formally the correctness of algorithms α and β that work at the
level of the clusters and in between them (the spanning forest subgraph that defines these clusters 
is assumed to be given). The proof is modular and hierarchical, and uses the I/O automaton for- 
malism to define the problem and to describe and prove the correctness of the algorithm offered as 
solution. The reader is referred to [3] for a treatment of I/O automata. 

We define synchronizing behavior by specifying an I/O automaton that uses global information 
about the system to coordinate the behaviors of all the client automata. Our goal is to arrive at 
algorithms (automata) running on each of the nodes of the networks that, acting together, emulate 
the behavior of the global synchronizer. More precisely, we will produce a collection of automata
such that, from the point of view of each client, executions of the collection are indistinguishable
from those produced in the presence of the global synchronizer.

First, we provide a specification for a global network synchronizer by giving a single I/O au- 
tomaton that synchronizes across the network by exchanging messages directly with each node in 
the network. We then repeatedly refine this specification according to Awerbuch’s construction. At 
each step, we will specify a collection of automata that will “implement” the synchronizing behavior 
of the global synchronizer in progressively “more distributed” settings. We will state and prove 
the corresponding correctness claims that justify each refinement, and finally arrive at a collection 
of automata that will run on individual nodes of the network - a collection that, we will prove, 
continues to synchronize the network. 

After a brief description of the setting, notations and conventions in Section 1.1, we proceed in 
Section 2 to give a general specification for client automata, which are node automata that execute 
steps of a given synchronous graph algorithm that we wish to execute on the network. We then 
define the previously described global synchronizer glob_synch. Having thus laid out the problem 
and goal, in Section 2.2 we detail a collection of automata that “implements” glob_synch. One of 
the components of the collection is a “local” synchronizer (called loc_synch in this report), which
ensures that adjacent nodes are within one synchronous-algorithm timestep of each other
at all times.

We prove in Section 2.3 that, as a consequence, no client can distinguish glob_synch from the 
collection, and hence that the collection is, in a weak sense, an implementation of glob_synch. 

A stronger (and conventional) notion of automaton implementation involves behavior inclusion:
we say that the composition C of a collection of automata implements an automaton A if every
behavior of C is a behavior of A.

It is this definition that will apply in subsequent sections when we consider the task of imple- 
menting loc_synch in a distributed setting. 

In Section 3.1, we describe how loc_synch is implemented at the level of each of Awerbuch’s clus- 
ters, and in between clusters. We prove that every external behavior of the composition of automata 
described is a behavior of loc_synch. In later sections, we provide distributed implementations and 
the necessary proofs of implementations of the intracluster and intercluster synchronizers that real- 
ize loc_synch. We dwell briefly on some highlights of the proof presented and end with a summary 
in Section 4. 


1.1. Preliminaries 


In this section, we describe the setting of the problem, and introduce some conventions and notations 
to be used throughout this report. 

We are working with an asynchronous, failure-free network, whose topology is described by a 
fixed, connected, communication graph G = (V, E), where the set of nodes V represents processors 
of the network and the set of links E represents the communication channels between them.
Messages can be exchanged between pairs of neighboring nodes, and arrive in the order in which they
are sent with finite but unbounded delays. 

We assume a basic data type message. Ordered pairs of the form (m, q), where m is a message
and q ∈ V is a vertex, are said to be of type tagged message, and we refer to (m, q) as the tagged
message tagged by q. We use ε to denote the null message.

We use “*” to denote a parameter that we allow to take any permitted value of the right type
(restrictions on the type and the values will be clear from the context).

With explicitly noted exceptions, Greek letters π, ρ and σ are used to denote actions of I/O
automata, and β, β', γ and δ to represent sequences of those actions.

We call neigh(p) the set {q ∈ V | (p, q) ∈ E}. Wherever used, i and j are always nonnegative
integers. 

To understand the functioning of client automata, we will need the idea of preservation of 
certain properties by clients. We provide a formal definition here, deferring motivations for use to 
the next section, where clients are discussed. 


Definition 1.1 Let M be an I/O automaton and P (the set of properties to be preserved) be a
non-empty, prefix-closed set of sequences of actions from a set Φ satisfying Φ ∩ int(M) = ∅. M is
said to preserve P if βπ|Φ ∈ P whenever β|Φ ∈ P, π ∈ out(M), and βπ|M ∈ finbehs(M).


Variables s and t denote states of automata. The following definition describes our notation for
talking about states of compositions of automata. 


Definition 1.2 Let A = ∏_i A_i be a composition of the collection of automata {A_i}. For all s ∈ states(A),
s[A_i] is the state of component automaton A_i when A is in state s.


Throughout this report, unless otherwise specified, all arrays of automata state variables store 
values of type boolean, and all I/O automata implementations are such that all boolean variables 
are initially false, and once set to true by an action, remain true forever thereafter during that 
execution. With one exception (the client automata, Section 2.1.1), all I/O automata implemented 
in this report have no internal actions, are completely deterministic and are such that once any of 
their output actions is enabled, it is disabled only if it is performed. 

In the next section we lay out the basic problem of network synchronization using the I/O 
automaton formalism. 


2 The Network Synchronization Problem 


In this section, we formalize the network synchronization problem.


2.1 A specification of a globally synchronous system


With each node p ∈ V, we will associate an automaton client(p) that, we will assume, runs the
rounds of the synchronous algorithm we wish to execute on the network. 

In this section we will give an informal overview of synchronizing behavior and the workings 
of client automata and glob_synch, and then translate that description into the language of I/O 
automata. 

The client automata are assumed to follow a protocol whereby, at the end of each round, they 
each output a tagged message set that contains all outgoing messages, and then wait for an input 
action delivering incoming messages before commencing computations for the subsequent round. 

The automaton glob_synch ensures synchronous execution of the underlying algorithm by wait- 
ing until all the clients have output their outgoing tagged message sets before delivering incoming 
messages to any client. 

We now model the above in the framework of the I/O automaton formalism. We begin with 
the specification of the client automata, and follow through with that of glob_synch. 


2.1.1 Client automata 


For concreteness, we assume that each client automaton’s external actions consist of an input action
and an output action. The output action carries round-specific messages out to neighboring nodes, and
the input action conveys that round’s incoming messages from neighboring nodes to the client.
Furthermore, we attribute a liveness property to the client automata by requiring them to always 
respond to incoming messages, whenever they are given the chance. This restriction allows us to 
work in the same network-failure free model that is the setting for Awerbuch’s algorithm. 

We make the assumption that clients are I/O automata and give below the specification of 
a client automaton at an arbitrary node p ∈ V (client(p)) that captures the above requirements
formally. 


Signature 
The client automaton at each node p has the action signature given below. 


Input: 

client_input(p, M, i), M a set of tagged messages
Output:

client_output(p, M, i), M a set of tagged messages
Internal: 

arbitrary 


The notation for describing action signatures has a straightforward interpretation, and is used 
consistently throughout the remainder of this report. Consider, for example, the client’s output 
action. Each client has an output action that, in some unspecified manner, contains three pieces of 
data in it to be conveyed to the automaton sharing this action with the client: A node identifier 
p that, in this case, identifies the client that performs the action, a tagged message set M of the 
indicated type that contains outgoing messages, and i, a positive integer (by convention), that is 
typically interpreted to be the round number of the synchronous algorithm during which the action 
occurred. 

A similar interpretation applies to the client’s input action. Internal actions of the client, which 
will include computations for the underlying synchronous algorithm, are left unspecified. 

We continue with the description of the restrictions on the clients. 

For each p ∈ V, associate a set of p-well-formed sequences defined as follows:


Definition 2.1 A sequence (finite or infinite) is said to be p-well-formed if it consists of alternating
occurrences of client_output(p, *, *) and client_input(p, *, *), starting with client_output(p, *, 1)
followed by client_input(p, *, 1), such that with each occurrence of client_output(p, *, i), i advances
by 1, and with each occurrence of client_input(p, *, j), j advances by 1.


Let W_p be the set of p-well-formed sequences. We require the client at node p to preserve W_p and
be such that no fair behavior of that automaton ends in client_input(p, *, *).
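
As an illustration of Definition 2.1, the following Python sketch checks whether a finite sequence of client actions is p-well-formed. The encoding of an action as a (kind, node, round) triple and the name is_p_well_formed are assumptions made for this example only; message sets are omitted because the definition places no constraint on them.

```python
from typing import Iterable, Tuple

# Hypothetical encoding: (kind, node, round), e.g. ("client_output", "p", 1).
Action = Tuple[str, str, int]


def is_p_well_formed(seq: Iterable[Action], p: str) -> bool:
    """Return True iff the finite sequence alternates client_output(p, *, i)
    and client_input(p, *, i), starting with client_output(p, *, 1), with the
    round number advancing by 1 after each output/input pair."""
    expected_kind, expected_round = "client_output", 1
    for kind, node, rnd in seq:
        if node != p or kind != expected_kind or rnd != expected_round:
            return False
        if kind == "client_output":
            # The client must now wait for this round's input.
            expected_kind = "client_input"
        else:
            # Input received: the next action must be the next round's output.
            expected_kind = "client_output"
            expected_round += 1
    return True
```

Under this encoding the empty sequence is accepted, which agrees with the requirement that the preserved set be non-empty and prefix-closed, while a sequence that outputs twice in a row is rejected.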

By stipulating that clients preserve p-well-formedness, we ensure that clients do not send a 
message “out of turn” - the client is constrained to wait for incoming messages for a round before 
it can send out the outgoing messages for the next round. It is this requirement that supports our 
interpretation of the client as an automaton executing steps of a synchronous algorithm. Clients’
input actions thus aid in the information flow of the underlying synchronous algorithm, and serve
as a “clock pulse”, telling the client to commence computations for a fresh round.

The constraint on the last action of a client’s behavior guarantees that the client will not simply
“crash” in the middle of a protocol - a requirement consistent with our assumption that the system
is failure-free.

We now proceed to describe the automaton glob_synch that will model synchronous execution 
on the network. 


2.1.2. The global synchronizer 


The global synchronizer (depicted in Figure 1) works as follows.
After a client automaton finishes the computations that are to be carried out at that node for
one round of the synchronous algorithm, it sends to the synchronizing automaton a client_output
message containing all messages to be communicated to its neighbors in the graph. The automaton
glob_synch waits until it has received similar messages from all the clients, repackages messages by
destination node, sends to each client one client_input message containing all messages from its
neighbors for that round, and waits for responses in the form of client_output messages of the next
round.

Figure 1: The Global Synchronizer and Clients

Since each of the clients preserves the appropriate well-formedness property, the global
synchronizer’s functioning thus ensures synchronous execution of the underlying synchronous algorithm.

The following implementation of glob_synch makes it work as described. 


Signature 


The automaton shares each input and output action with the clients in the network, and we proceed 
to describe its functioning formally. 


Input: 

client_output(p, M, i), p ∈ V, M a set of tagged messages
Output:

client_input(p, M, i), p ∈ V, M a set of tagged messages
Internal: 

none 


State 


array tray(p, i), p ∈ V, of sets of tagged messages, initially all empty
array client_output_recd(p, i), p ∈ V
array client_input_sent(p, i), p ∈ V


Transitions 


client_output(p, M, i)
Effect:
    s.client_output_recd(p, i) = true
    ∀q, s.tray(q, i) = s'.tray(q, i) ∪ {(m, p) | (m, q) ∈ M}


client_input(p, M, i)
Precondition:
    ∀q ∈ V, s'.client_output_recd(q, i) = true
    M = s'.tray(p, i)
    s'.client_input_sent(p, i) = false
Effect:
    s.client_input_sent(p, i) = true


Partitions 


All client_input(p, *, *) actions are in one class, for each p ∈ V.
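
The specification of glob_synch above can be mirrored by a small executable model. The following Python sketch is only an illustration under stated assumptions: the class name GlobSynch, the method client_input_enabled, and the use of sets of (p, i) pairs in place of boolean arrays are choices made for this example and are not part of the formal specification.

```python
from collections import defaultdict


class GlobSynch:
    """Illustrative model of glob_synch; `nodes` plays the role of V."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.tray = defaultdict(set)       # (p, i) -> set of tagged messages
        self.client_output_recd = set()    # pairs (p, i) whose flag is true
        self.client_input_sent = set()     # pairs (p, i) whose flag is true

    def client_output(self, p, M, i):
        """Input action: record p's round-i output and sort it into trays."""
        self.client_output_recd.add((p, i))
        for m, q in M:                     # (m, q): message m addressed to q
            self.tray[(q, i)].add((m, p))  # retag the message with its sender

    def client_input_enabled(self, p, i):
        """Precondition: all clients have output for round i, p not yet served."""
        return (all((q, i) in self.client_output_recd for q in self.nodes)
                and (p, i) not in self.client_input_sent)

    def client_input(self, p, i):
        """Output action: deliver tray(p, i) to client p."""
        assert self.client_input_enabled(p, i)
        self.client_input_sent.add((p, i))
        return set(self.tray[(p, i)])
```

For instance, with two nodes a and b, a call to client_input for b in round 1 becomes enabled only after both a and b have performed client_output for round 1, and it can then be performed exactly once.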


Our interpretation of the network synchronization problem is that it is the task of implementing
the glob_synch automaton in a distributed setting in a fair manner. That is, we must provide
implementations that simulate fair behaviors of glob_synch.

In the following two sections we will provide a provably correct fair “implementation” of the
composition of the automaton glob_synch and all the client automata.

The automaton glob_synch works by holding back clients from executing successive steps of 
the algorithm until all clients have completed all message exchanges for previous steps; a “locally 
synchronous” implementation of glob_synch we are now going to consider will instead have each 
client wait only until all its neighbors have finished their work for the earlier rounds. 

Our motivations for looking at such an implementation are twofold. Firstly, it should be clear 
that waiting until the latter condition is met is substantially easier to realize in a network. That 
aside, the fact that Awerbuch’s synchronizer meets this weaker requirement and not that of the 
global synchronizer - this is intuitive - renders this implementation a natural choice owing to its 
potential to be refined into Awerbuch’s synchronizer. 

The flip side of this decision, though, is that we have to deal with a collection of automata 
which does not quite “implement” glob_synch in the usual sense (involving behavioral inclusion). 
What we will establish in the upcoming sections is that the collection we are going to provide 
(we will have called it LF by then) simulates the existence of glob_synch from the viewpoint of 
each client executing the synchronous algorithm, though each execution may itself not be globally 
synchronous. To be precise, we will show that for every behavior of LF interacting with client 
automata, there exists a behavior of the composition glob_synch, also interacting with clients, such 
that both behaviors project identically on each client - thus maintaining the semblance of global 
synchrony through local synchronization: 

Theorem Let β be any behavior of LF · ∏_{p∈V} Client(p). Then there exists
γ ∈ behs(glob_synch · ∏_{p∈V} Client(p)) such that for all p ∈ V, β|Client(p) = γ|Client(p).


We now proceed to the locally synchronous fair implementation of glob_synch.


2.2 An implementation of the global synchronizer 


The automaton glob_synch is modeled as a collection of several automata: one front-end on each
node that shares actions with the client automaton on that node, two link automata L(p → q) and
L(q → p) between the front ends of every two clients that happen to be on neighboring nodes p
and q, and the local synchronizer that we shall call loc_synch, that shares messages with all the
front ends.

The front ends interface with the client automaton and receive tagged message sets from their 
clients that contain messages to be communicated to client automata on neighboring nodes. Each 
front-end sorts and distributes its own client’s outgoing messages (to other front-ends, via the 
bridging link automata), acknowledges receipt of incoming tagged messages thus distributed and 
collates the incoming messages into a tagged message set. The tagged message set is then forwarded 
to the client if the front-end has been granted permission to do so by loc_synch. 

When the receipt of all tagged messages sent out has been acknowledged, each front-end noti- 
fies loc_synch of this condition (this is the “safety” condition in Awerbuch’s description). When 
loc_synch receives this signal from a front-end and all its neighbors, it authorizes that front-end 
to forward the tagged message set of incoming messages to its client, that is, it tells the front-end 
to generate the clock pulse that starts off the next round. Figure 2 depicts the various automata 
described and the actions they share among themselves. 

We now give the formal specifications of front-end and link automata, and the local synchronizer, 
in that order, in each case also partitioning locally controlled actions into fairness classes. 
Figure 2: Automata implementing the global synchronizer


2.2.1 Front end automata 


The front-end automaton (front_end(p)) at any node p ∈ V has the following formal specification.
We remind the reader of our convention that, unless noted otherwise, all arrays of automata state 
variables store boolean values, and that all these values are initialized to be false. 


Input:
client_output(p, M, i), M a set of tagged messages
send^in(q, p, μ, i), q ∈ neigh(p), μ a set of messages
ack^in(q, p, i), q ∈ neigh(p)
go(p, i)

Output:
client_input(p, M, i), M a set of tagged messages
send^out(p, q, μ, i), q ∈ neigh(p), μ a set of messages
ack^out(p, q, i), q ∈ neigh(p)
ok(p, i)

Internal:
none


State 


array client_output_recd(i)
array client_input_sent(i)
array pkt_from(q, i), q ∈ neigh(p)
array pkt_for(q, i), q ∈ neigh(p)
array pkt_sent(q, i), q ∈ neigh(p)
array ack_recd(q, i), q ∈ neigh(p)
array ack_sent(q, i), q ∈ neigh(p)
array go_recd(i)
array ok_sent(i)
array outbox(q, i), q ∈ neigh(p), of sets of messages, initially all empty
array inbox(i), of sets of tagged messages, initially all empty
Transitions


client_output(p, M, i)
Effect:
    s.client_output_recd(i) = true
    ∀q such that (m, q) ∈ M, for some m:
        s.outbox(q, i) = {m | (m, q) ∈ M}
        s.pkt_for(q, i) = true


send^out(p, q, μ, i)
Precondition:
    s'.pkt_sent(q, i) = false
    s'.pkt_for(q, i) = true
    μ = s'.outbox(q, i)
Effect:
    s.pkt_sent(q, i) = true


ack^in(q, p, i)
Effect:
    s.ack_recd(q, i) = true


send^in(q, p, μ, i)
Effect:
    s.inbox(i) = {(m, q) | m ∈ μ} ∪ s'.inbox(i)
    s.pkt_from(q, i) = true


ack^out(p, q, i)
Precondition:
    s'.pkt_from(q, i) = true
    s'.ack_sent(q, i) = false
Effect:
    s.ack_sent(q, i) = true


ok(p, i)
Precondition:
    s'.client_output_recd(i) = true
    ∀q ∈ neigh(p),
        if s'.pkt_for(q, i) = true
        then s'.ack_recd(q, i) = true
    s'.ok_sent(i) = false
Effect:
    s.ok_sent(i) = true


go(p, i)
Effect:
    s.go_recd(i) = true


client_input(p, M, i)
Precondition:
    s'.go_recd(i) = true
    M = s'.inbox(i)
    s'.client_input_sent(i) = false
Effect:
    s.client_input_sent(i) = true
Partitions


All client_input(p, *, *) actions in one class,
all send^out(p, q, *, *) actions in one class, for each q ∈ neigh(p),
all ack^out(p, q, *) actions in one class, for each q ∈ neigh(p),
all ok(p, *) actions in one class.
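
The “safety” condition that gates ok(p, i) can be illustrated with a partial executable model of the front end. This Python sketch covers only the state that the precondition of ok(p, i) reads; the class name FrontEnd, the method names ack_in and ok_enabled, and the use of sets in place of boolean arrays are assumptions of this example rather than part of the specification.

```python
class FrontEnd:
    """Illustrative fragment of front_end(p): the state behind ok(p, i)."""

    def __init__(self, p, neighbours):
        self.p = p
        self.neigh = set(neighbours)
        self.client_output_recd = set()  # rounds i whose flag is true
        self.pkt_for = set()             # pairs (q, i): a packet is owed to q
        self.ack_recd = set()            # pairs (q, i): q acknowledged round i
        self.ok_sent = set()             # rounds i whose flag is true
        self.outbox = {}                 # (q, i) -> set of messages

    def client_output(self, M, i):
        """Input action: sort the client's tagged messages by destination."""
        self.client_output_recd.add(i)
        for m, q in M:
            self.outbox.setdefault((q, i), set()).add(m)
            self.pkt_for.add((q, i))

    def ack_in(self, q, i):
        """Input action: the link from q delivers q's round-i acknowledgement."""
        self.ack_recd.add((q, i))

    def ok_enabled(self, i):
        """Precondition of ok(p, i): the client has output for round i and
        every neighbour owed a packet in round i has acknowledged it."""
        return (i in self.client_output_recd
                and all((q, i) in self.ack_recd
                        for q in self.neigh if (q, i) in self.pkt_for)
                and i not in self.ok_sent)

    def ok(self, i):
        """Output action: report round i as safe to loc_synch."""
        assert self.ok_enabled(i)
        self.ok_sent.add(i)
```

Note that a neighbour to which no packet is owed in round i does not delay ok(p, i), matching the conditional form of the precondition.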


2.2.2 Link automata 


Front-end automata exchange messages through link automata that model communication channels
between nodes in the network. We model an edge between any two nodes p and q of the network
with two unidirectional link automata L(p → q) and L(q → p), which carry messages from
one front-end to another. We specify L(p → q) below.


Input:
send^out(p, q, μ, i), μ a set of messages
ack^out(p, q, i)


Output:
send^in(p, q, μ, i), μ a set of messages
ack^in(p, q, i)


Internal:
none


State 


buffer, a first-in first-out queue of elements of arbitrary type, initially empty
Transitions


send^out(p, q, μ, i)
Effect:
    Add send(μ, i) to buffer


ack^out(p, q, i)
Effect:
    Add ack(i) to buffer


send^in(p, q, μ, i)
Precondition:
    first(s'.buffer) = send(μ, i)
Effect:
    s.buffer = rest(s'.buffer)


ack^in(p, q, i)
Precondition:
    first(s'.buffer) = ack(i)
Effect:
    s.buffer = rest(s'.buffer)


Partitions 


All send^in(p, q, *, *) actions in one class,
all ack^in(p, q, *) actions in one class.
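
The link automaton is essentially a FIFO queue shared by packets and acknowledgements travelling in one direction. The following Python sketch illustrates this under stated assumptions: the class name Link and the method names send_out, ack_out and deliver are hypothetical, and the two output actions of the link are merged into a single deliver method that returns whichever record is at the head of the buffer.

```python
from collections import deque


class Link:
    """Illustrative model of L(p -> q): a FIFO buffer of send/ack records."""

    def __init__(self):
        self.buffer = deque()

    def send_out(self, mu, i):
        """Input action: front_end(p) hands over packet mu for round i."""
        self.buffer.append(("send", frozenset(mu), i))

    def ack_out(self, i):
        """Input action: front_end(p) acknowledges q's round-i packet."""
        self.buffer.append(("ack", i))

    def deliver(self):
        """Output action at q's side: remove and return the head of the
        buffer, or None if the buffer is empty (no output is enabled)."""
        return self.buffer.popleft() if self.buffer else None
```

Because a single queue holds both kinds of records, an acknowledgement enqueued after a packet can never overtake it, which is the ordering guarantee assumed for channels in Section 1.1.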


2.2.3. The local synchronizer 


As mentioned earlier, loc_synch ensures that a client’s neighbors have all executed a round of the 
synchronous algorithm before allowing it to begin the next round. 


Input:
ok(p, i), p ∈ V

Output:
go(p, i), p ∈ V


Internal:
none


State 


array go_sent(p, i), p ∈ V
array ok_recd(p, i), p ∈ V
Transitions


ok(p, i)
Effect:
    s.ok_recd(p, i) = true


go(p, i)
Precondition:
    ∀q such that q ∈ neigh(p) ∪ {p}, s'.ok_recd(q, i) = true
    s'.go_sent(p, i) = false
Effect:
    s.go_sent(p, i) = true


Partitions 


All go(p, *) actions in one class, for each p ∈ V.
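
To make the difference from glob_synch concrete, the following Python sketch models loc_synch on an explicit adjacency map. The class name LocSynch, the method go_enabled and the dictionary adj (mapping each node to its set of neighbours) are assumptions of this example and do not appear in the specification.

```python
class LocSynch:
    """Illustrative model of loc_synch; adj maps each node to neigh(node)."""

    def __init__(self, adj):
        self.adj = {p: set(qs) for p, qs in adj.items()}
        self.ok_recd = set()   # pairs (p, i) whose flag is true
        self.go_sent = set()   # pairs (p, i) whose flag is true

    def ok(self, p, i):
        """Input action: front_end(p) reports round i as safe."""
        self.ok_recd.add((p, i))

    def go_enabled(self, p, i):
        """Precondition of go(p, i): p and all its neighbours have sent ok."""
        return (all((q, i) in self.ok_recd for q in self.adj[p] | {p})
                and (p, i) not in self.go_sent)

    def go(self, p, i):
        """Output action: authorize front_end(p) to deliver round-i input."""
        assert self.go_enabled(p, i)
        self.go_sent.add((p, i))
```

On the path a - b - c, once ok has been received from a and b for round 1, go is enabled for a even though c has not yet reported; glob_synch, by contrast, would hold a back until every client had finished the round.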


In the upcoming section, we will show how every locally synchronous execution of a synchronous 
algorithm corresponds to a globally synchronous execution with respect to any client. 


2.3 Proof of implementation 


Let LF be the automaton formed by the composition of the Local synchronizer with all the
Front_end(p) automata and the links in between them, and let LC = LF · ∏_{p∈V} Client(p) be the
“Local synchronizer and Client automata” composition. As mentioned earlier, automaton glob_synch
differs from its “implementation” LF in that it begins a new round of communication with its
clients only after they have all replied to the earlier set of messages. In the case of LF, on the
other hand, loc_synch allows a client to proceed to a subsequent round as soon as the client itself
and all its neighbors have responded to the earlier rounds’ messages. In this section we will argue
that this divergence in functionality is acceptable by proving that LF “implements” glob_synch in
a precisely defined, but unconventional sense. Put in informal terms, what we will show is that as
far as each client can tell, it will be running in a synchronous system.

Let GC (“Global synchronizer and Client composition”) be the composition of glob_synch with
all the client automata. Our task will be to show that given any behavior β of LC, there exists a
behavior γ of GC such that β and γ “look alike” to each of the clients. That is:


Theorem Given a behavior β of LC, there exists γ ∈ behs(GC) such that for all p ∈ V,
β|Client(p) = γ|Client(p).
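
The statement can be illustrated on a three-node path a - b - c. In the Python sketch below, the triple encoding of actions, the function project and the two example sequences are assumptions made for illustration: both sequences list client actions only, and message sets are omitted. The sequence named local is one that local synchronization permits (a starts round 2 before c has output for round 1), while glob is a globally synchronous reordering of the same actions.

```python
def project(behavior, p):
    """Subsequence of the actions in `behavior` that belong to client(p)."""
    return [a for a in behavior
            if a[0] in ("client_output", "client_input") and a[1] == p]


# Locally synchronous: a receives its round-1 input and begins round 2
# before c, which is not a neighbour of a, has produced its round-1 output.
local = [("client_output", "a", 1), ("client_output", "b", 1),
         ("client_input", "a", 1), ("client_output", "a", 2),
         ("client_output", "c", 1), ("client_input", "b", 1),
         ("client_input", "c", 1)]

# Globally synchronous: no round-1 input precedes any round-1 output.
glob = [("client_output", "a", 1), ("client_output", "b", 1),
        ("client_output", "c", 1), ("client_input", "a", 1),
        ("client_input", "b", 1), ("client_input", "c", 1),
        ("client_output", "a", 2)]

# The two sequences differ, yet every client sees the same projection.
same_for_all_clients = all(project(local, p) == project(glob, p) for p in "abc")
```

The two sequences are different orderings of the same actions, and the comparison in the last line is exactly the per-client equality that the theorem asserts.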


To this end, we will reorder the sequence of actions in β to get β' such that β' under an
appropriate projection gives γ, a behavior of GC with the required property.

To construct the required permutation β' of β, we will define a relation that captures a dependency
notion among the actions in β, and will prove that any permutation of these actions that
is consistent with the relation is also a behavior of LC. We will then prove that it is possible to
transform any β into a sequence of actions β', while preserving the constraints imposed by the
relation, so that β' mirrors the way GC interacts with its clients (with outgoing messages to clients
being sent only after each client has responded to all previous communications). From the fact
that β' is also a behavior of LC, we will be able to show that its projection γ on acts(GC) yields
a behavior of the latter automaton, and that β and γ project identically on each of the clients.

After proving that LC thus “implements” GC, we will justify why executions of GC thus
generated are fair to its component automata, thus showing that LC results in fair implementations
of GC.


2.3.1 Basic properties of the locally synchronous system 


Before we state the dependency relation, we will prove some results that will allow us to reason 
about LF’s working. 

Each client’s proper behavior is predicated upon its receiving appropriate inputs in response to 
its actions. We proceed to show that LF preserves well-formedness so that we are guaranteed that 
every client’s behavior is also a well-formed sequence of actions. 


Theorem 2.1 For all p ∈ V, LF preserves p-well-formedness. That is, letting Φ_p = ext(client(p)),
for all p ∈ V, if β|Φ_p ∈ W_p, and π ∈ out(LF) is such that βπ|ext(LF) ∈ finbehs(LF), then
βπ|Φ_p ∈ W_p.


Proof: For all p, if π ∉ Φ_p, or β|Φ_p is the empty sequence of actions, the theorem holds trivially.
If not, π = client_input(p, M, i) for some M and i. Since Φ_p ⊆ acts(LF) and βπ|ext(LF) is finite,
β|Φ_p is finite too, and has a last action.

Let ρ be that last action in β|Φ_p. We need to show that ρ = client_output(p, M', i) for some M'.

As mentioned in the section on Preliminaries, all the component automata of LF are deterministic
and none of the automata whose composition is LF has internal actions. The projection
β|ext(LF) ∈ finbehs(LF) thus determines a unique execution of LF, and hence a unique state
(say s') that LF will reach after that execution. Since βπ|acts(LF) ∈ behs(LF), we have, from
the code implementing the client_input action in front_end(p), s'[front_end(p)].go_recd(i) = true.
The state variable go_recd(i) is set to true iff go(p, i) is received from loc_synch, and hence we
examine the preconditions governing the execution of that action, in loc_synch.

One precondition ensures that s'[loc_synch].ok_recd(p, i) = true. (We exploit our implementational
convention that boolean values, once set to true in an execution, remain so for the remainder
of that execution.) This, in turn, implies that in front_end(p), s'[front_end(p)].ok_sent(i) =
true, which is predicated upon s'[front_end(p)].client_output_recd(i) being true. The action
client_output(p, *, i) therefore occurs in β.

Action client_input(p, *, i)’s preconditions imply that it has not occurred in β if βπ|acts(LF)
is in finbehs(LF). We can now infer that ρ = client_output(p, *, i). For, if not, let ρ' be
the action immediately following client_output(p, *, i) in β|Φ_p. Since β|Φ_p ∈ W_p, ρ' is of the form
client_input(p, *, i), which is a contradiction. □

Another useful property of LC is that actions are never repeated in its behaviors. We prove
that now.


Theorem 2.2 In any β ∈ behs(LC), for any p, q and i, each of ack^in(q, p, i), ack^out(p, q, i), ok(p, i)
and go(p, i) occurs at most once in β. Furthermore, for any p, q and i, there can be at most one
action of each of the following forms in β:
client_output(p, *, i), send^out(p, q, *, i), send^in(p, q, *, i), client_input(p, *, i).


Proof: 

Each client_output(p, *, i), for any p and i, is the output action of a client that, by definition,
preserves well-formedness. It is thus guaranteed that i (which we interpret to be the round
number) takes on increasing values with successive occurrences of the action, and hence that
client_output(p, *, i) occurs at most once.

Every other action in β is a locally controlled action of one of the component automata of LC
(since no component automaton uses internal actions). The “code” implementing each ensures
non-repetition, as can be verified by inspection. □

We will now describe the dependency relation. With each schedule β of LC there is a relation →_β
that captures the ordering that LC imposes upon the actions in β. The key property of this relation
is that for any schedule β of the implementation system, and any permutation β' of β that preserves
the partial order of events given by →_β, we have that β' is also a schedule of the implementation
system. In other words, →_β captures enough about the dependencies in the schedule to ensure that
any reordering that is consistent with these dependencies is still a valid schedule of the system.

We give a formal definition of the relation →_β.


Definition 2.2 Given a schedule β of LC and two actions π and ρ that occur in β, the pair (π, ρ)
belongs to the relation →_β (we write π →_β ρ) if π and ρ have the form of one of the following pairs,
for any p ∈ V, for any q ∈ neigh(p), for any q' ∈ neigh(p) ∪ {p}, for any sets of tagged messages
M and M', for any set of messages μ, and for any i > 0:


client_input(p, M', i) and client_output(p, M, i + 1),
client_output(p, M, i) and send^out(p, q, μ, i),
send^out(p, q, μ, i) and send^in(p, q, μ, i),
send^in(p, q, μ, i) and ack^out(q, p, i),
ack^out(q, p, i) and ack^in(q, p, i),
ack^in(q, p, i) and ok(p, i),
client_output(p, M, i) and ok(p, i),
ok(q', i) and go(p, i), or
go(p, i) and client_input(p, M, i).


We write ρ_1 →_β ρ_2 →_β ··· →_β ρ_r to mean ρ_i →_β ρ_{i+1} for all i, 0 < i < r.


Definition 2.3 Given any behavior β of LC, we write π ⇝_β ρ to mean that actions π and ρ occur
in that order in β, though not necessarily consecutively.


Remark: Note that since Theorem 2.2 ensures that there are no repeated occurrences of any action,
for any distinct actions π and ρ that occur in β, exactly one of π ⇝_β ρ and ρ ⇝_β π will be true.

In lemmas to follow, we will use the definitions given above and associate the dependency
relation with the relative order of occurrence of actions within a schedule. We will first show that
for any two actions π and ρ of a schedule β of LF such that π →_β ρ, it will also be the case that
π ⇝_β ρ.
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Definition 2.4 A permutation ' of B € behs(LC) is said to be consistent with the relation —, 
(>,-consistent) if for any distinct actions 7 and p in B, m ~»g: p whenever T >, p. 


We will then use these lemmas to show that any permutation of a schedule of the implementation
automaton remains a schedule, provided that the relative ordering of its actions is consistent
with the dependency relation.
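
As an illustration of Definition 2.4, the sketch below checks whether a permutation keeps every dependent pair in order. It is only a sketch under stated assumptions: actions are hashable tuples with no repeats (as Theorem 2.2 guarantees), and the dependency relation is passed in as a two-argument predicate. The toy predicate ok_before_go is hypothetical and covers only the ok/go form, ignoring the neighbor condition.

```python
from collections import Counter

def is_consistent(beta, beta_prime, depends):
    """Check that beta_prime is a permutation of beta that keeps every pair
    (a, b) with depends(a, b) in the order a-before-b."""
    if Counter(beta) != Counter(beta_prime):
        raise ValueError("beta_prime is not a permutation of beta")
    position = {act: k for k, act in enumerate(beta_prime)}
    return all(
        position[a] < position[b]
        for a in beta
        for b in beta
        if a != b and depends(a, b)
    )

def ok_before_go(a, b):
    """Toy relation (assumption): every ok of a round precedes every go of it."""
    return a[0] == "ok" and b[0] == "go" and a[2] == b[2]

beta = [("ok", "p", 1), ("ok", "q", 1), ("go", "p", 1), ("go", "q", 1)]
swapped = [("ok", "q", 1), ("ok", "p", 1), ("go", "q", 1), ("go", "p", 1)]
broken = [("ok", "p", 1), ("go", "p", 1), ("ok", "q", 1), ("go", "q", 1)]
```

Here swapped reorders only independent actions, while broken moves go(p, 1) ahead of ok(q, 1) and so violates the toy relation.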


Lemma 2.3 Let β ∈ behs(LC). For any p, q, μ, M and i of the right type,


1. If client_output(p, M, i+1) occurs in β then there exists a set M' of tagged messages such
that client_input(p, M', i) ⇝_β client_output(p, M, i+1).


2. If send^out(p, q, μ, i) occurs in β then there exists a set N of tagged messages such that
{(m, q) | m ∈ μ} ⊆ N and client_output(p, N, i) ⇝_β send^out(p, q, μ, i).


3. If send^inp(p, q, μ, i) occurs in β then send^out(p, q, μ, i) ⇝_β send^inp(p, q, μ, i).


4. If ack^out(p, q, i) occurs in β then there exists a set ν of messages such that
send^inp(q, p, ν, i) ⇝_β ack^out(p, q, i).


5. If ack^inp(p, q, i) occurs in β then ack^out(p, q, i) ⇝_β ack^inp(p, q, i).


6. If ok(q, i) occurs in β then, for all p ∈ neigh(q) for which there exists a set ν of messages
such that send^out(q, p, ν, i) occurs in β, it will be the case that ack^inp(p, q, i) ⇝_β ok(q, i). Also,
there exists a set N of tagged messages such that client_output(p, N, i) ⇝_β ok(q, i).


7. If go(p, i) occurs in β then ∀q such that q ∈ neigh(p) ∪ {p}, ok(q, i) ⇝_β go(p, i).


8. If client_input(p, M, i) occurs in β then go(p, i) ⇝_β client_input(p, M, i).
Proof:


1. This follows directly from the fact that LF and all the clients preserve W_p, and hence
∀p, β|client(p) ∈ W_p.


2. Verified by inspecting the code implementing actions client_output and send^out in front_end(p).


3. send^inp(p, q, μ, i) is an output action of L(p → q). In order for this action to be enabled,
send^out(p, q, μ, i) must have occurred at an earlier point in β.


The other cases are verified similarly. ∎

The main goal of this subsection is to show that any permutation of β ∈ behs(LC) that is
consistent with →_β is also a behavior of LC. To this end, we will need the lemmas given below.


Definition 2.5 The transitive closure of the →_β relation (written →⁺_β) is defined in the usual way.


Thus π →⁺_β σ implies ∃ρ_i, i = 1, ···, r, r > 1, such that ρ_1 = π, ρ_r = σ and ρ_1 →_β ρ_2 →_β ··· →_β ρ_r.
Consistency of a schedule with respect to →⁺_β is defined in the way →_β-consistency is, and we
state the equivalence of the two notions in this lemma:

Lemma 2.4 Every →_β-consistent permutation of β ∈ behs(LC) is →⁺_β-consistent.

Proof: By induction on the length of the chain ρ_1 →_β ρ_2 →_β ··· →_β ρ_r. ∎

Lemma 2.5 Every behavior β of LC is →_β-consistent.


Proof: Follows from Lemma 2.3.1-8. For example, let us verify that ∀p ∈ V, ∀q ∈ neigh(p),
if ack^inp(q, p, i) →_β ok(p, i), then in β, ack^inp(q, p, i) ⇝_β ok(p, i). Since ack^inp(q, p, i) →_β ok(p, i),
ack^inp(q, p, i) occurs in β. By successively applying Lemma 2.3.5 and Lemma 2.3.4, we get ∃ν such
that send^inp(p, q, ν, i) occurs in β. Then by Lemma 2.3.3, send^out(p, q, ν, i) occurs in β. Thus, by
Lemma 2.3.6, we get ack^inp(q, p, i) ⇝_β ok(p, i). The other cases follow similarly. ∎


Corollary 2.6 Every behavior β of LC is →⁺_β-consistent.


We will use these results to prove the main theorem of this subsection:


Theorem 2.7 Given β ∈ behs(LC), let β' be any permutation of β consistent with →_β. Then
β' ∈ behs(LC).


Proof: To prove that β' ∈ behs(LC), it suffices to show that when β' is projected on each
component automaton of LC, the resulting sequences are behaviors of executions of the respective
automaton [3]. We will provide the proof in the case when the component automaton is a front-end,
and omit the easier proofs that establish the result for the other component automata of LC.
Our proof will be based, in part, on the following fact:

Proposition Let β be any schedule of LC. If send^inp(q, p, μ, i) and client_input(p, M, i) occur in
β, then
send^inp(q, p, μ, i) →⁺_β client_input(p, M, i).


Proof: We successively apply sub-lemmas 2.3.8 and 2.3.7, in that order, instantiating quantifiers
as the notation suggests. From sub-lemma 2.3.3, send^out(q, p, μ, i) occurs in β and hence from 2.3.6,
ack^inp(p, q, i) occurs in β. Applying 2.3.5 and 2.3.4, we find that there exists a ν such that actions
send^inp(q, p, ν, i), ack^out(p, q, i), ack^inp(p, q, i), ok(q, i), go(p, i), and client_input(p, M, i) all occur
in β. By Theorem 2.2, ν = μ. Since each successive pair of actions in the above sequence is related
by →_β, send^inp(q, p, μ, i) →⁺_β client_input(p, M, i). ∎


Lemma 2.8 ∀p ∈ V, β'|front_end(p) ∈ behs(front_end(p)).


Proof of lemma: The proof is by induction on the length of β'. In the case when |β'| = 0, the
theorem holds trivially. Otherwise let δ be a prefix of β' such that δ|front_end(p) ∈ behs(front_end(p)).
Consider γ = δπ, where π ∈ acts(front_end(p)) is the next action in β' (we need to look only at
such actions).

If π is an input action, γ ∈ behs(front_end(p)), as inputs are always enabled. If not, π is
one of client_input(p, M, i), send^out(p, q, μ, i), ack^out(p, q, i) or ok(p, i), for some q, μ, M and i. We
need to show that in each case, that particular output action must be enabled. A proof for the
case when π = client_input(p, M, i) follows. If π = client_input(p, M, i), by Lemma 2.3.8, go(p, i)
occurs before π in β. Since β' is consistent with →_β, the same order of occurrence is maintained
in γ. By Theorem 2.2 we know that no action of the form client_input(p, *, i) has occurred at an
earlier point in γ. We must now show that M will have the correct value. In other words, we need
this claim:

Claim ∀q such that send^inp(q, p, μ, i) occurs in β,

send^inp(q, p, μ, i) ⇝_β client_input(p, M, i) iff send^inp(q, p, μ, i) ⇝_β' client_input(p, M, i).

Proof of claim:

(⇒) By the Proposition and Lemma 2.4.

(⇐) By Corollary 2.6 and the Proposition. ∎

Hence the preconditions for the client_input(p, M, i) output action of automaton front_end(p) have
been satisfied, and the action is currently enabled, i.e. δπ ∈ behs(front_end(p)).

When π is send^out(p, q, μ, i) for some μ, or ack^out(p, q, i) or ok(p, i), the fact that π is enabled
follows immediately from the appropriate case of Lemma 2.3. ∎

Similar proofs for each of the other component automata in LC show that β' projects correctly on
each. Hence β' ∈ behs(LC). ∎


2.3.2 Reordering yields synchronized behaviors 


We will now specify a way of reordering the actions of any behavior β of LC while maintaining
→_β such that the resulting β' ∈ behs(LC) will “look like” a behavior of GC. In particular, the
reordered schedule will correspond to executions of LF where, at every round, front-ends send out
inputs to clients only after they have confirmed that all the tagged messages in every client’s output
have been delivered to their destinations.

We shall call such behaviors “synchronized”, and show how to construct the synchronized
counterpart of any behavior of LC. In the next subsection, we will prove that upon projection
onto GC, any synchronized behavior of LC will result in a behavior of GC. We will then show
that given any behavior β of LC, no client will be able to distinguish β from the behavior of GC
obtained by projecting onto GC the synchronized counterpart of β.


Definition 2.6 A behavior β ∈ behs(LC) is said to be a synchronized behavior if for all i, and for
all p, q such that actions ok(p, i) and go(q, i) are in β, ok(p, i) ⇝_β go(q, i).
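
The condition of this definition can be tested on a finite sequence by comparing, for each round, the position of the last ok with the position of the first go. The sketch below assumes a hypothetical tuple encoding of actions in which the first field is the action name and the last field is the round number.

```python
def is_synchronized(beta):
    """True if, in every round, all ok actions precede all go actions."""
    last_ok = {}    # round -> index of the last ok(*, round)
    first_go = {}   # round -> index of the first go(*, round)
    for k, act in enumerate(beta):
        name, rnd = act[0], act[-1]
        if name == "ok":
            last_ok[rnd] = k
        elif name == "go":
            first_go.setdefault(rnd, k)
    return all(last_ok[r] < first_go[r] for r in last_ok if r in first_go)
```

For example, the sequence ok(p, 1), go(p, 1), ok(q, 1) is rejected because go(p, 1) precedes ok(q, 1).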


Definition 2.7 Given β ∈ behs(LC), let
• actions(β, i) = {π occurs in β | π is tagged with round number i},
• client_inputs(β, i) = {π | ∃M, p such that π = client_input(p, M, i) occurs in β},
• oks(β, i) = {π | for some p, π = ok(p, i)},
• gos(β, i) = {π | for some p, π = go(p, i)},
• other_acts(β, i) = {π | π occurs in β and is not in client_inputs(β, i), oks(β, i) or gos(β, i)}.

We now provide a way of extracting from a behavior β of LC, a synchronized behavior of LC.


Definition 2.8 If β_i, i = 1, 2, ···, n are finite sequences of actions, let ⊙_{i=1}^{n} {β_i} be the sequence
concatenation β_1 β_2 ··· β_n.


Theorem 2.9 For any behavior β ∈ behs(LC), consider

β' = ⊙_{i=1}^{∞} {β|other_acts(β, i) · β|oks(β, i) · β|gos(β, i) · β|client_inputs(β, i)}.

β' is a synchronized behavior of LC.


Proof: β' is a permutation of β as every action of LC is tagged with a round number and so
is an element of exactly one actions(β, i) for some i, and other_acts(β, i), oks(β, i), gos(β, i) and
client_inputs(β, i) partition actions(β, i) for each i.

We need to show that β' is consistent with the →_β relation. First, we observe that by Lemma
2.5, β itself is consistent with the →_β relation.

Consider relations involving actions with the same round number, say i. It can be checked that
the above reordering violates none of the →_β-consistency requirements, as none of the relations
involving actions ok, go and client_input are violated.

Turning to orderings involving actions with different round numbers, all we have to show is that
orderings of the form client_input(p, *, i) →_β client_output(p, *, i+1) are maintained. But this is
true by construction.

β' is thus consistent with →_β. Hence, by Theorem 2.7, β' ∈ behs(LC). β' is synchronized by
construction, and this completes the proof. ∎
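
On a finite prefix, the reordering of Theorem 2.9 amounts to a stable sort: group actions by round, and within a round place the other actions first, then the ok actions, then the go actions, then the client_input actions, keeping the original relative order inside each group. The following is a sketch of that construction only; it assumes the hypothetical tuple encoding (name first, round number last) and does not itself verify consistency with the dependency relation.

```python
# Position of each action kind within a round; any other kind sorts first (0).
_GROUP = {"ok": 1, "go": 2, "client_input": 3}

def synchronize(beta):
    """Reorder a finite sequence as in Theorem 2.9.

    Python's sorted() is stable, so actions that fall into the same
    (round, group) bucket keep the relative order they had in beta.
    """
    return sorted(beta, key=lambda act: (act[-1], _GROUP.get(act[0], 0)))

beta = [
    ("client_output", "p", 1), ("ok", "p", 1), ("go", "p", 1),
    ("client_input", "p", 1), ("client_output", "q", 1), ("ok", "q", 1),
    ("go", "q", 1), ("client_input", "q", 1), ("client_output", "p", 2),
]
beta_sync = synchronize(beta)
```

In beta_sync the two client_output actions of round 1 come first, followed by both ok actions, both go actions and both client_input actions, and only then by the round 2 action.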


2.3.3 Synchronized behaviors are globally synchronous


Given β ∈ behs(LC), let β' be the synchronized behavior obtained after applying Theorem 2.9 to
β. We will now extract from β' a behavior γ of GC which is such that with respect to each client
automaton, γ and β are indistinguishable.

We will take γ to be β'|acts(GC), and first show that

γ = β'|acts(GC) ∈ behs(GC).

We consider a restriction of automaton loc_synch that will enforce synchronized behavior. We will
show that the composition of the front-ends and the restricted loc_synch automaton will generate all
the synchronized behaviors of LC. This will be followed by a straightforward possibilities mapping
proof showing that this composition implements GC. We will thus have proved that γ is a behavior
of GC. We will then conclude with the proof that no client can tell γ and β apart.


Theorem 2.10 If β' ∈ behs(LC) is synchronized then β'|acts(GC) ∈ behs(GC).


Proof: Consider a restriction of loc_synch to Rloc_synch, where automaton Rloc_synch has the
same signature and states as loc_synch, and executes the same transitions that loc_synch does on
all but one of the loc_synch actions. The automaton Rloc_synch will differ from loc_synch only in
its implementation of the action go(*, *). The restricted version of the loc_synch automaton will
ensure that all front-ends are “safe” before it gives any front-end permission to send its client that
round’s input.

Rloc_synch


Input:
    ok(p, i), p ∈ V
Output:
    go(p, i), p ∈ V
Internal:
    none


State


    array go_sent(p, i), p ∈ V
    array ok_recd(p, i), p ∈ V


Transitions

ok(p, i)
    Effect:
        s.ok_recd(p, i) = true

go(p, i)
    Precondition:
        for all q ∈ V, s'.ok_recd(q, i) = true
        s'.go_sent(p, i) = false
    Effect:
        s.go_sent(p, i) = true

Partitions


    All go(p, *) actions in one class, for each p ∈ V.
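
To make the strengthened precondition concrete, here is a minimal executable sketch of Rloc_synch. The class and method names are hypothetical, and each boolean array of the specification is represented by the set of index pairs whose entry is true.

```python
class RlocSynch:
    """Toy model of Rloc_synch: go(p, i) waits for ok(q, i) from every node q."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.ok_recd = set()   # pairs (p, i) with ok_recd(p, i) = true
        self.go_sent = set()   # pairs (p, i) with go_sent(p, i) = true

    def ok(self, p, i):
        """Input action: always enabled."""
        self.ok_recd.add((p, i))

    def go_enabled(self, p, i):
        """Precondition of go(p, i): every node is ok for round i, go not yet sent."""
        all_ok = all((q, i) in self.ok_recd for q in self.nodes)
        return all_ok and (p, i) not in self.go_sent

    def go(self, p, i):
        """Output action: may only be performed when its precondition holds."""
        if not self.go_enabled(p, i):
            raise RuntimeError("go(%r, %r) is not enabled" % (p, i))
        self.go_sent.add((p, i))
```

By contrast, loc_synch as used later in the text only requires ok_recd(q, i) for q = p and the neighbors q of p, so every go step of this sketch is also a go step that loc_synch would allow.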


Let
RLC = Rloc_synch · ∏_{p ∈ V} front_end(p) · client(p).


Then the following claim is supported by a direct proof:


Claim Every synchronized behavior of LC is a behavior of RLC.

Proof of claim: Indeed, a behavior β of LC may diverge from being a behavior of RLC only at an
occurrence of go(*, *) as it is this action alone that needs a different (stronger) set of preconditions
satisfied in RLC. But it is this very case that is taken care of in the requirement that β be
synchronized. Hence the proof follows. ∎

The theorem will stand proved if we show that RLC implements GC.

Let Φ = {π ∈ acts(RLC) : π ∉ acts(GC)} and R̅L̅C̅ = hide_Φ RLC.


Lemma 2.11 Consider f defined as follows: ∀s ∈ states(R̅L̅C̅),

f(s) = {t ∈ states(GC) | for all p and i,
    t.client_output_recd(p, i) = s[front_end(p)].client_output_recd(i),
    t.client_input_sent(p, i) = s[front_end(p)].client_input_sent(i),
    t.tray(p, i) = ∪_{r ∈ neigh(p)} {(m, r) | s[front_end(r)].pkt_for(p, i) = true and
        m ∈ s[front_end(r)].outbox(p, i)}}.


Then f is a possibilities mapping from R̅L̅C̅ to GC.


The above proof will need the two claims that follow. The truth of these claims is established by
operational means, and the proofs are omitted here.


Claim 2.1 For s ∈ states(RLC), if s[front_end(p)].go_recd(i) = true for some p ∈ V, then

1. s[front_end(q)].client_output_recd(i) = true for all q ∈ V.

2. s[front_end(p)].inbox(i) = ∪_{r ∈ neigh(p)} {(m, r) | s[front_end(r)].pkt_for(p, i) = true and
m ∈ s[front_end(r)].outbox(p, i)}.


Proof of lemma: If s ∈ start(R̅L̅C̅) then the existence of a start state in GC consistent with
f is easily verified. Let s' be a reachable state of R̅L̅C̅, t' any state in f(s') and (s', π, s) a
step of R̅L̅C̅. We need to produce an extended step of GC (t', γ, t) such that t ∈ f(s) and
γ|ext(LS) = π|ext(R̅L̅C̅). We handle cases differently depending on the kind of action π is.

When π is a hidden action, that is, π ∈ Φ, then γ will clearly have to be a sequence of no (zero)
actions for f to be obeyed. It can be seen that no hidden action changes the values of the variables
that determine the action of f. Hence we can take t = t'.

Otherwise, if π is an input action of RLC and GC, meaning that π = client_output(p, M, i) for
some p, M and i, we let γ = π. By inspection, it can be seen that the final state t of GC will be
in f(s).

If not, π = client_input(p, M, i) for some p, M and i. Again, taking γ = π and then using
Claim 2.1, we can see that the preconditions of π are satisfied in GC if they are met in RLC.
Further, the occurrence of this action sets the client_input_sent variable to true, thus ensuring the
continued consistency of f.

Thus f is a possibilities mapping, and behs(R̅L̅C̅) ⊆ behs(GC). ∎

We thus prove the main theorem of this section. 


Theorem 2.12 Let β be any behavior of LC. Then there exists γ ∈ behs(GC) such that for all
p ∈ V, β|Client(p) = γ|Client(p).


Proof: By Theorem 2.9 and Theorem 2.10 there exists β', a permutation of β, such that γ =
β'|acts(GC) ∈ behs(GC). Since both β and β' are consistent with →_β, the relative ordering of all
client_output(p, *, *) and client_input(p, *, *) actions in β is maintained in γ for any p in V, and
since γ is a projection of a permutation of β, the contents of the tagged message sets exchanged
with the clients are not altered. Hence the two sequences project identically on each client. ∎
We will now examine LC and sketch a proof of fairness of the implementation.


2.4 Fairness of implementation


We have shown that behaviors of LC generate behaviors of GC. Since glob_synch is deterministic
in its functioning, such behaviors yield unique executions of that automaton. We will now argue
why such executions obtained from fair behaviors of LC are fair executions of glob_synch.

Let β be any fair behavior of LC, and let γ be a behavior of GC which projects as β does on
all the clients, as in Theorem 2.12. We assume also that γ is constructed as in that theorem, by
considering behavior β' of RLC, and let Γ be any execution of GC that has γ as its behavior (note
that since the clients have internal actions that we have no information about, distinct executions
of GC may yield indistinguishable behaviors). If Γ is not a fair execution, it can be shown that
β' will not be a fair behavior of RLC. This will imply that β is not a fair behavior of LC, thus
deriving a contradiction.

Combining the fairness argument with Theorem 2.12, we then have


Theorem 2.13 Let β be any fair behavior of LC. Then there exists γ ∈ fairbehs(GC) such that
for all p ∈ V, β|Client(p) = γ|Client(p).


We will first show that the global synchronizer preserves the well-formedness property, and
then, using this fact, prove that if the behavior γ of GC generated is unfair, then the behavior
of RLC that γ was “extracted” from is also unfair.


Theorem 2.14 Automaton glob_synch preserves p-well-formedness, for all p ∈ V. That is, for
each p in V, letting Φ_p = ext(client(p)), if β|Φ_p ∈ W_p, and π ∈ out(glob_synch) is such that
βπ|ext(glob_synch) ∈ finbehs(glob_synch), then βπ|Φ_p ∈ W_p.


Proof: The proof of this theorem strongly resembles that of Theorem 2.1.

For each p ∈ V, if π ∉ Φ_p or β|Φ_p is the empty sequence of actions, the theorem holds trivially.
If not, π = client_input(p, M, i) for some M and i. Since Φ_p ⊆ acts(LF) and βπ|ext(LF) is finite,
β|Φ_p is finite too, and has a last action.

Let ρ be that last action in β|Φ_p. We need to show that ρ = client_output(p, M', i) for some M'.

Since glob_synch is a deterministic I/O automaton, and has no internal actions, the projection
β|ext(glob_synch) determines a unique execution, and hence a unique state (say s') that
glob_synch reaches after that execution. Since βπ|ext(glob_synch) ∈ behs(glob_synch), we have,
from the code implementing glob_synch, s'[glob_synch].client_output_recd(p, i) = true. Thus the
action client_output(p, *, i) occurs in β.

Action client_input(p, *, i)’s preconditions imply that it has not occurred in β if βπ|acts(LF)
is in finbehs(LF). We can now infer that ρ = client_output(p, *, i). For, if not, let ρ' be
the action immediately following client_output(p, *, i) in β|Φ_p. Since β|Φ_p ∈ W_p, ρ' is of the form
client_input(p, *, i), which is a contradiction. ∎

We now proceed to the theorems that establish the fairness properties of the implementation.

Let β' be a behavior of automaton RLC, extracted from a behavior β of LC as per Theorem 2.9.
Behavior β' is synchronized, as can be checked from the code implementing RLC. By Theorem
2.10, γ = β'|acts(GC) is a behavior of GC.


Theorem 2.15 If β' is a fair behavior of RLC, then γ is a fair behavior of GC.

Proof: If not, let γ be an unfair behavior of GC, and p ∈ V be such that actions from the class of
actions of the form client_input(p, *, *) are enabled but never performed in γ. Then there will be
a smallest i such that client_input(p, *, i) is enabled, but never performed in Γ (an execution of
GC whose behavior is γ) because once an action of that form is enabled, it is not disabled unless
it is performed. It follows that client_input(p, *, i) does not occur in β'. Since the automaton
glob_synch preserves well-formedness with respect to all the clients, and the clients preserve
well-formedness, we hence have that client_output(p, *, i+1) will not occur in Γ, and hence not in β'.
From the code implementing the automaton RLC, this in turn implies that no action of the form
client_input(*, *, i+1) will occur in β' (the intuitive reason is that RLC has been built to permit
front-ends to respond with a round’s messages to the client only if all the clients have sent in their
messages for that round), thus putting an end to the exchanges between any front-end and its
client. Thus if γ is an unfair behavior of GC, then γ is finite.

We consider the case when γ is finite. If a finite behavior γ is unfair, then it follows that in the
final state t of Γ, an action of the form client_input(p, *, i) is enabled, for some p ∈ V.

Let α' be an execution that has β' as its behavior. We know that in GC,
t.client_input_sent(p, i) is false, and hence that in front_end(p), the corresponding
client_input_sent variable is false throughout that execution. In order that α' not have enabled
actions in the final state s of automaton RLC, it follows that s[front_end(p)].go_recd(i) has to be
false. In turn, in order that loc_synch not have the go action enabled in s, s[loc_synch].ok_recd(q, i) =
false for some q ∈ V. Working backwards in like fashion, we arrive at the fact that for some
r ∈ neigh(q), s[front_end(q)].pkt_for(r, i) = true, but s[front_end(q)].ack_recd(r, i) = false.
This, however, can be seen to imply that α' is not a fair execution of automaton RLC, thus
completing the proof. ∎

Thus we have proved that if γ is unfair, so is β'. We now proceed to show that if β' is unfair, β
(the behavior of LC that was permuted to yield β') is not fair, thereby proving that fair behaviors
of LC correspond to fair behaviors of GC, as far as each client can tell.


As before, let behavior β' of RLC correspond to an execution α' of RLC, and β ∈ behs(LC)
correspond to an execution α of LC. We will show that if an action is “permanently enabled” in
α', then the same is true of that action in α.


Theorem 2.16 If α is a fair execution of LC, then α' is a fair execution of RLC.


Proof: If α' is an unfair execution of RLC, then there exists a class of actions C, and a state s in
α' such that in s and all subsequent states (if any), some action from that class is always enabled.

We first consider the case when C is the class of actions of the form client_input(p, *, *) for some
p in V. There will be a smallest i such that client_input(p, *, i) is enabled in state s of α' and all
subsequent states, but never performed. Thus client_input(p, *, i) never occurs in α', and hence not
in α as well. Since that action is enabled, it follows that s[front_end(p)].go_recd(i) is true, meaning
that go(p, i) occurs in α' and α. Thus in α, the action go(p, i) occurs, while client_input(p, *, i)
never does. But this enables the latter action permanently, thus making α unfair. We have thus
shown that if α' is an execution unfair to this choice of C, so is α.

Similar proofs apply when C is a class consisting of actions of any of the other possible forms:
send^inp, send^out, ack^inp, ack^out, ok, go or client_input. This completes the proof. ∎

Thus we have shown that from the point of view of each client, fair behaviors of the implementation
automata are indistinguishable from fair behaviors of the global synchronizer.

The rest of this report will deal with fashioning a provably correct, fair, distributed implementation
of the local synchronizer loc_synch.


3 Implementing a Network Synchronizer 


In this section, our task will be to provide a fair, distributed implementation of the automaton
loc_synch described in Section 2.2. Our reasons for doing so are straightforward. From Theorem
2.13 of the earlier section, we know that fair behaviors of clients and front-ends interacting with
loc_synch “look like” fair behaviors of GC to each client. A distributed and fair node-level
implementation of automaton loc_synch will thus leave us with a synchronizing system whose behaviors
are indistinguishable from those of GC, as far as each client is concerned.

The automaton loc_synch is implemented in two stages. We partition the graph into clusters of
nodes, and assume, in the first stage, that we have some means of synchronizing all the nodes in each
cluster. We then describe the process of synchronizing all the clusters. In the second stage of our
implementation, we consider the problem of synchronizing the nodes inside each cluster (we provide
node-level implementations) and the problem of giving node-level implementations of the intercluster
synchronization process of the earlier stage. We will deal with fairness issues in each stage, thus
ensuring that, at the end of the section, we have a fair and correct node-level implementation of a
network synchronizer.


3.1 Cluster Level Synchronization 


In this section we will describe a set of automata that will implement loc_synch.

The graph is divided into clusters of nodes, each with its own spanning tree (we assume that this
spanning forest is constructed with some preprocessing). The synchronizing algorithm is derived as
a composition of a simple synchronizer (called β in [1]) executing within each cluster, and another
simple synchronizer (called α) that synchronizes between the clusters.

Each cluster implements a distributed algorithm that interfaces with the front_end of each
client automaton associated with each node in the cluster. At each pulse, this algorithm, an
intracluster synchronizer, receives and exchanges messages and detects the condition that all messages
of the underlying synchronous algorithm for that pulse sent out by each node in the cluster have
been received. The algorithm then passes this information on to an intercluster synchronizer.

When the intercluster synchronizer receives this information from all neighboring clusters and
the cluster in question, it gives the intracluster synchronizer permission to allow nodes in its cluster
to execute the next pulse of the synchronous algorithm. Note that this behavior closely parallels
that of loc_synch, the automaton that enforces locally synchronous behavior between neighboring
clients. A pictorial representation of the cluster level implementation follows in Figure 3.

We begin by specifying the implementation automata that will carry out the above tasks, and
subsequently prove the correctness of the implementation using mapping techniques. We will then
briefly argue that fair behaviors of the implementation will give rise to fair behaviors of the
implemented system. In later sections, we will give distributed implementations of the intracluster
and intercluster synchronizers described here.

Figure 3: Automata implementing the local synchronizer

We call “forest” the spanning forest that is assumed to be given. If C is a subtree of forest,
neighboring_trees(C) is the set of trees in forest that edges from nodes in C are incident upon,
and nodes(C) is the set of nodes in C. With each subtree C of forest we associate an automaton,
an intracluster synchronizer (cluster_synch(C)) that does the job of synchronizing the clients on
the nodes in the cluster. We give the formal specification of an arbitrary intracluster synchronizer
below.


3.1.1 Cluster level synchronizer 


Automaton cluster_synch(C), C a subtree of forest:


Signature


Input:
    ok(p, i), p ∈ nodes(C)
    cluster_go(C, i)
Output:
    go(p, i), p ∈ nodes(C)
    cluster_ok(C, i)
Internal:
    none


State


    array go_sent(p, i), p ∈ nodes(C)
    array ok_recd(p, i), p ∈ nodes(C)
    array cluster_go_recd(i)
    array cluster_ok_sent(i)


Transitions

ok(p, i)
    Effect:
        s.ok_recd(p, i) = true

cluster_ok(C, i)
    Precondition:
        ∀p ∈ nodes(C), s'.ok_recd(p, i) = true
        s'.cluster_ok_sent(i) = false
    Effect:
        s.cluster_ok_sent(i) = true

cluster_go(C, i)
    Effect:
        s.cluster_go_recd(i) = true

go(p, i)
    Precondition:
        s'.cluster_go_recd(i) = true
        s'.go_sent(p, i) = false
    Effect:
        s.go_sent(p, i) = true

Partitions


    All go(p, *) actions in one class, for each p ∈ nodes(C);
    all cluster_go(C, *) actions in one class.


3.1.2 A synchronizer for clusters 


The intercluster synchronization part of the implementation is carried out by an automaton called
forest_synch. As mentioned earlier, its functioning, captured in the formalism given below,
closely resembles that of LS.


Input:
    cluster_ok(C, i), C a subtree of forest
Output:
    cluster_go(C, i), C a subtree of forest
Internal:
    none


State


    array cluster_ok_recd(C, i), for all C a subtree of forest
    array cluster_go_sent(C, i), for all C a subtree of forest


Transitions

cluster_ok(C, i)
    Effect:
        s.cluster_ok_recd(C, i) = true

cluster_go(C, i)
    Precondition:
        ∀C' ∈ neighboring_trees(C) ∪ {C}, s'.cluster_ok_recd(C', i) = true
        s'.cluster_go_sent(C, i) = false
    Effect:
        s.cluster_go_sent(C, i) = true

Partitions


    All cluster_go(C, *) actions in one class, for each C a subtree of forest.
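
The two specifications above can be exercised together in a small executable sketch. The class and method names below are hypothetical, each boolean array is represented by the set of indices whose entry is true, and the caller plays the role of the composition by forwarding every cluster_ok and cluster_go action to both automata that share it.

```python
class ClusterSynch:
    """Toy model of cluster_synch(C) for one cluster with the given nodes."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.ok_recd, self.go_sent = set(), set()
        self.cluster_ok_sent, self.cluster_go_recd = set(), set()

    def ok(self, p, i):                       # input action
        self.ok_recd.add((p, i))

    def cluster_ok_enabled(self, i):          # precondition of cluster_ok(C, i)
        return (i not in self.cluster_ok_sent
                and all((p, i) in self.ok_recd for p in self.nodes))

    def cluster_ok(self, i):                  # output action
        if not self.cluster_ok_enabled(i):
            raise RuntimeError("cluster_ok is not enabled")
        self.cluster_ok_sent.add(i)

    def cluster_go(self, i):                  # input action
        self.cluster_go_recd.add(i)

    def go_enabled(self, p, i):               # precondition of go(p, i)
        return (p in self.nodes and i in self.cluster_go_recd
                and (p, i) not in self.go_sent)

    def go(self, p, i):                       # output action
        if not self.go_enabled(p, i):
            raise RuntimeError("go is not enabled")
        self.go_sent.add((p, i))


class ForestSynch:
    """Toy model of forest_synch; neighboring_trees maps a cluster to a set."""

    def __init__(self, neighboring_trees):
        self.neighboring_trees = neighboring_trees
        self.cluster_ok_recd, self.cluster_go_sent = set(), set()

    def cluster_ok(self, c, i):               # input action
        self.cluster_ok_recd.add((c, i))

    def cluster_go_enabled(self, c, i):       # precondition of cluster_go(C, i)
        needed = self.neighboring_trees[c] | {c}
        return ((c, i) not in self.cluster_go_sent
                and all((d, i) in self.cluster_ok_recd for d in needed))

    def cluster_go(self, c, i):               # output action
        if not self.cluster_go_enabled(c, i):
            raise RuntimeError("cluster_go is not enabled")
        self.cluster_go_sent.add((c, i))
```

With two neighboring clusters A and B, cluster_go(A, 1) becomes enabled only after both clusters have reported cluster_ok for round 1, which mirrors the way loc_synch waits for a node and its neighbors.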


3.2 Proof of implementation 


We will now prove that the composition of forest_synch and the cluster_synch automata (call it
CF) implements loc_synch, in the sense involving behavioral inclusion. The possibilities mapping
proof we provide will show that every behavior of CF, when projected onto loc_synch, will yield a
behavior of the latter automaton.

We first state a lemma whose truth follows directly from definitions.


Lemma 3.1 For all p, C such that p ∈ nodes(C),
{p} ∪ neigh(p) ⊆ {q | q ∈ nodes(C') such that C' ∈ neighboring_trees(C) ∪ {C}}.
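
The lemma can be checked mechanically on a small example. The sketch below assumes a hypothetical representation in which cluster_of maps every node to the name of its cluster and neigh maps every node to its set of neighbors; neighboring_trees is computed from these exactly as the text defines it.

```python
def neighboring_trees(cluster_of, neigh, c):
    """Clusters other than c that contain a neighbor of some node of c."""
    return {cluster_of[q]
            for p, cp in cluster_of.items() if cp == c
            for q in neigh[p]} - {c}

def lemma_3_1_holds(cluster_of, neigh):
    """Check {p} U neigh(p) against the nodes of C and its neighboring trees."""
    for p, c in cluster_of.items():
        allowed = neighboring_trees(cluster_of, neigh, c) | {c}
        covered = {q for q, cq in cluster_of.items() if cq in allowed}
        if not ({p} | neigh[p]) <= covered:
            return False
    return True

# A path a1 - a2 - b1 - c1 split into clusters A = {a1, a2}, B = {b1}, C = {c1}.
cluster_of = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
neigh = {"a1": {"a2"}, "a2": {"a1", "b1"}, "b1": {"a2", "c1"}, "c1": {"b1"}}
```

On this example the neighboring trees of A are {B} and those of B are {A, C}, and the inclusion holds at every node.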


Let Φ = {π ∈ acts(CF) : π = cluster_go(*, *) or cluster_ok(*, *)}, and C̅F̅ = hide_Φ CF. We
will need the following lemmas for the main theorem of this section that guarantees implementation.


Lemma 3.2 For any p ∈ V, let C be the cluster such that p ∈ nodes(C).
If CF is in state s such that s[cluster_synch(C)].cluster_go_recd(i) = true then
∀C' ∈ neighboring_trees(C) ∪ {C}, s[forest_synch].cluster_ok_recd(C', i) = true.


Lemma 3.3 For any C a subtree of forest, if CF is in state s such that
s[forest_synch].cluster_ok_recd(C, i) = true, then ∀q ∈ nodes(C),
s[cluster_synch(C)].ok_recd(q, i) = true.

Both lemmas follow from operational proofs which are omitted here.

Using these lemmas, we will show a possibilities mapping between C̅F̅ and loc_synch which will
prove that every behavior of C̅F̅ is a behavior of loc_synch. Since the behaviors of C̅F̅ are plainly
those of CF projected onto loc_synch, we will have proved the correctness of the implementation.


Theorem 3.4 Consider f defined as follows: ∀s ∈ states(C̅F̅),

f(s) = {t ∈ states(loc_synch) | ∀p ∈ V,
    t.go_sent(p, i) = s[cluster_synch(C)].go_sent(p, i) and
    t.ok_recd(p, i) = s[cluster_synch(C)].ok_recd(p, i),
    where C is the cluster such that p ∈ nodes(C)}.

Then f is a possibilities mapping from C̅F̅ to loc_synch.


Proof: If s ∈ start(C̅F̅) then the existence of a start state in loc_synch consistent with f is easily
verified. Let s' be a reachable state of C̅F̅, t' any state in f(s') and (s', π, s) a step of C̅F̅. We need
to produce an extended step of loc_synch (t', γ, t) such that t ∈ f(s) and γ|ext(LS) = π|ext(C̅F̅).
We have the following cases depending on π:


• π = ok(p, i) for p ∈ V. Then take γ = π. Since ok(p, i) is an input action, for some t,
(t', γ, t) is a step of LS. From the code describing the transitions of LS and cluster_synch(C)
(as above), and given that t' ∈ f(s'), we can infer that t ∈ f(s).


• π = go(p, i) for p ∈ V. Again γ = π. If go(p, i) is an enabled output action in s', it is an
enabled action in loc_synch as well: we need to show that t'.go_sent(p, i) = false and that ∀q
such that q = p or q ∈ neigh(p), t'.ok_recd(q, i) = true. The first condition is guaranteed by
the mapping. We establish the truth of the second proposition now. Since go(p, i) is enabled
in CF, s'[cluster_synch(C)].cluster_go_recd(i) = true.
From Lemmas 3.2, 3.3 and 3.1, the corresponding ok_recd variables in t'[loc_synch] are true.
go(p, i) is therefore enabled when loc_synch is in state t'. Let t be the state after go(p, i) has
been performed. Clearly t ∈ f(s), and the mapping is legal.


• π = cluster_go(*, *) or cluster_ok(*, *). Since these are both hidden actions, the mapping
must leave t unchanged, with γ being a sequence of zero actions. This is appropriate as neither
action affects variables of the form s[*].go_sent(*, *) or s[*].ok_recd(*, *), which determine the
mapping.


After having thus shown that behaviors of CF project onto loc_synch as the latter automaton’s
behavior, we proceed now to argue that fair behaviors of CF will generate fair behaviors of loc_synch
too.
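
The behavioral inclusion can also be observed experimentally on small instances. The sketch below is an illustration under stated assumptions and not a substitute for the proof: it runs a toy version of CF under a random scheduler, records only the external ok and go actions, and then replays that trace against the preconditions of loc_synch as they are used in the proof above (go(p, i) requires go_sent(p, i) to be false and ok_recd(q, i) to be true for q = p and every neighbor q of p). All function names are hypothetical.

```python
import random

def simulate_cf(cluster_of, neigh, rounds, seed):
    """Run a toy CF with random scheduling; return its external ok/go trace."""
    rng = random.Random(seed)
    clusters = sorted(set(cluster_of.values()))
    members = {c: [p for p in cluster_of if cluster_of[p] == c] for c in clusters}
    # neighboring_trees(C) together with C itself
    nbr = {c: {cluster_of[q] for p in members[c] for q in neigh[p]} | {c}
           for c in clusters}
    ok_recd, go_sent = set(), set()   # cluster_synch arrays, pooled over clusters
    # cluster_ok / cluster_go are shared by sender and receiver in the
    # composition, so one set per action kind records both "sent" and "recd".
    c_ok, c_go = set(), set()
    pending = [("ok", p, i) for p in cluster_of for i in range(1, rounds + 1)]
    trace = []
    while True:
        enabled = list(pending)       # inputs supplied by the environment
        for c in clusters:
            for i in range(1, rounds + 1):
                if (c, i) not in c_ok and all((p, i) in ok_recd for p in members[c]):
                    enabled.append(("cluster_ok", c, i))
                if (c, i) not in c_go and all((d, i) in c_ok for d in nbr[c]):
                    enabled.append(("cluster_go", c, i))
                if (c, i) in c_go:
                    enabled += [("go", p, i) for p in members[c]
                                if (p, i) not in go_sent]
        if not enabled:
            return trace
        act = rng.choice(enabled)
        name, x, i = act
        if name == "ok":
            pending.remove(act)
            ok_recd.add((x, i))
            trace.append(act)
        elif name == "go":
            go_sent.add((x, i))
            trace.append(act)
        elif name == "cluster_ok":
            c_ok.add((x, i))          # hidden action: not recorded
        else:
            c_go.add((x, i))          # hidden action: not recorded

def accepted_by_loc_synch(trace, neigh):
    """Replay an ok/go trace against the preconditions of loc_synch."""
    ok_recd, go_sent = set(), set()
    for name, p, i in trace:
        if name == "ok":
            ok_recd.add((p, i))
        elif (p, i) in go_sent or any((q, i) not in ok_recd for q in neigh[p] | {p}):
            return False
        else:
            go_sent.add((p, i))
    return True
```

Running simulate_cf with several seeds on a small clustered graph and feeding each trace to accepted_by_loc_synch gives a quick sanity check of the inclusion on that instance.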


3.3 Fairness of Implementation 


Let β be a behavior of CF interacting with clients, their front-ends and all the link automata,
and γ be such that γ = β|loc_synch is a behavior of loc_synch. As before, we let Γ be the unique
execution that γ is the behavior of (Γ's uniqueness is a consequence of the implementing automata
being deterministic, and of their not having internal actions).

We will show that Γ's being unfair will imply that β is not a fair behavior. We thus derive a
contradiction to our assumption that β is a fair behavior of CF, and complete the fairness analysis.


Theorem 3.5 If β is a fair behavior of CF interacting with clients, their front-ends and the
bridging link automata, then γ = β|loc_synch is a fair behavior of loc_synch.


Proof: We know from Theorem 3.4 that γ is a behavior of loc_synch. Assume that Γ, the unique
execution of loc_synch that has γ as its behavior, is not a fair execution of loc_synch. Then there is a
smallest i such that for some p ∈ V, go(p,i) is enabled, but never occurs in Γ.

Since for all j < i, go(q,j) occurs in Γ for each q in V, the action client_input(q,*,i−1) is
enabled at some point in β, and will occur in β. By Theorem 2.1, actions client_output(q,*,i)
therefore occur in β for each q ∈ V, and hence in Γ as well.

Let p be in the cluster C in forest. If action go(p,i) does not occur in CF, it follows that for
some node q in C or one of its neighboring clusters (that is, the set {r | r ∈ nodes(C') such that
C' ∈ neighboring_trees(C) ∪ {C}}), ok(q,i) does not occur in Γ either, as otherwise β would be
unfair too. But if ok(q,i) is never enabled in Γ even after client_output(q,*,i) has occurred, it
follows that client(q)'s packet meant for a neighbor is never acknowledged. This, however, can
happen only if β is unfair, thus deriving a contradiction. □

In subsequent sections we will provide distributed implementations of the cluster_synch and
forest_synch automata that are formal specifications of synchronizer β (at the intracluster level)
and α (at the intercluster level) respectively, as described in Awerbuch's paper. Note that we
can, in principle, use the distributed versions of arbitrary synchronizers to perform the task at the
two levels. All we have to ensure is that the cluster_ok and cluster_go actions shared by the two
specification automata can be communicated in the network between the implementations. In this
paper we restrict ourselves to implementing the α and β synchronizers that Awerbuch uses, where the
problem of communicating is simply solved (as will be made clear shortly) by having the automata
that share these actions run on the same physical node in the network.

We close this section with a brief overview of the remainder of the report.

The next section will detail the distributed implementation of cluster_synch automata, and the 
one after that, a mapping proof that will prove the implementation correct. This will be followed 
by an implementation of the automaton forest_synch, and the corresponding correctness proof in 
Sections 3.7 and 3.8. Finally, we end with a summary in Section 4. 


3.4 Intracluster Synchronization 


Our task in this section will be to give a distributed implementation of the intracluster synchronizer
cluster_synch(C) operating in a cluster consisting of the nodes of a subtree C of the given spanning
forest of G.

Automaton cluster_synch's function is to detect, after the commencement of each round, the
condition that messages of the underlying synchronous algorithm sent out by each node in its
cluster have been received. That is, it detects the condition that the front ends at all the nodes in the
cluster have output an ok action for that round.

Figure 4: (Non-)Root Node Automaton of the intracluster synchronizer


At each non-root node of C, we have a component automaton intra_non_root of our distributed
implementation sharing the ok action with the front end at that node. Whenever an intra_non_root
automaton learns that its front end has sent its ok, and that the same is true of the front ends
at all descendant nodes, it communicates this fact to its parent in C. The process of conveying
"okness" thus commences at the leaf nodes of C, and terminates at the root node of C. At the
root node another component automaton (we shall refer to it as the "intra_root" automaton) waits
until the convergecast terminates, and then outputs a cluster_ok action indicating that all nodes
in its tree have output their oks for that round.

An additional task performed by cluster_synch is to communicate, to all the nodes in its 
cluster, the receipt of the cluster_go action from the intercluster synchronizer. This is realized by 
broadcasting a message indicating receipt down the tree, from parent nodes to daughter nodes, 
until all the leaves have received this information. 

We now give a concise description of the composition's functions in terms of actions
exchanged between constituent components.

The intra_root automaton at the root node of each cluster waits until all the oks of the nodes in
the cluster have been convergecast to it, and then outputs a cluster_ok. Upon receiving a cluster_go,
it broadcasts this information down all subtrees. The intra_non_root automata, one at every
non-root node of a cluster, assist in the above by conveying the okness of the front end at that node
and at all descendant nodes up the tree, and by broadcasting the subtree_go message output by the root
node when intra_root receives a cluster_go (see Figure 4).

If C is a subtree and p ∈ nodes(C), let children(p,C) denote p's children in C, and parent(p,C)
be p's parent in C (if p is not the root of the subtree). Also, let root(C) denote the root node of C.
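The convergecast of oks and the broadcast of the go signal described above can be sketched, very informally, as two traversals of the cluster tree. The sketch below is only an illustration under simplifying assumptions: it ignores asynchrony and message buffering, handles a single round, and the names `children_map`, `convergecast_ok` and `broadcast_go` are hypothetical helpers that do not appear in the paper.

```python
from collections import defaultdict

def children_map(parent):
    """Build children(p, C) from a parent(p, C) table; the root maps to None."""
    kids = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            kids[par].append(node)
    return kids

def convergecast_ok(parent, ok_nodes):
    """Return True iff the root would learn that every node in the tree is ok."""
    kids = children_map(parent)
    root = next(n for n, par in parent.items() if par is None)

    def subtree_ok(p):
        # A node reports upward only when it and all its descendants are ok.
        return p in ok_nodes and all(subtree_ok(q) for q in kids[p])

    return subtree_ok(root)

def broadcast_go(parent):
    """Return the order in which nodes learn of cluster_go, root first."""
    kids = children_map(parent)
    root = next(n for n, par in parent.items() if par is None)
    order, frontier = [], [root]
    while frontier:
        p = frontier.pop(0)
        order.append(p)          # p performs go(p, i)
        frontier.extend(kids[p])  # p relays subtree_go to each child
    return order

# Hypothetical four-node cluster: a is the root, b and c its children, d a child of b.
tree = {"a": None, "b": "a", "c": "a", "d": "b"}
```

With this tree, `convergecast_ok(tree, {"a", "b", "c"})` is false because node d has not yet reported, which mirrors the root withholding cluster_ok until every ok in the cluster has arrived.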

We provide formal specifications of the described automata.

3.4.1 Root node automata of the intracluster synchronizer

Automaton Intra_root(p,C), p = root(C)


Signature


Input:
    ok(p,i)
    subtree_ok^inp(q,p,i), q ∈ children(p,C)
    cluster_go(C,i)


Output:
    go(p,i)
    subtree_go^out(p,q,i), q ∈ children(p,C)
    cluster_ok(C,i)


Internal:
    none


State


    array ok_recd(i)
    array go_sent(i)
    array subtree_ok_recd(q,i), q ∈ children(p,C)
    array subtree_go_sent(q,i), q ∈ children(p,C)
    array cluster_ok_sent(i)
    array cluster_go_recd(i)


Transitions


ok(p,i)
    Effect:
        s.ok_recd(i) = true


subtree_ok^inp(q,p,i)
    Effect:
        s.subtree_ok_recd(q,i) = true


cluster_ok(C,i)
    Precondition:
        ∀q ∈ children(p,C), s'.subtree_ok_recd(q,i) = true
        s'.ok_recd(i) = true
        s'.cluster_ok_sent(i) = false
    Effect:
        s.cluster_ok_sent(i) = true


cluster_go(C,i)
    Effect:
        s.cluster_go_recd(i) = true


go(p,i)
    Precondition:
        s'.cluster_go_recd(i) = true
        s'.go_sent(i) = false
    Effect:
        s.go_sent(i) = true


subtree_go^out(p,q,i)
    Precondition:
        s'.cluster_go_recd(i) = true
        s'.subtree_go_sent(q,i) = false
    Effect:
        s.subtree_go_sent(q,i) = true


Partitions


All go(p,*) actions in one class,
all subtree_go^out(p,q,*) actions in one class, for each q ∈ children(p,C),
all cluster_ok(C,*) in one class.


3.4.2 Non-root node automata in the intracluster synchronizer 


Automaton Intra_non_root(p,C), p ∈ nodes(C), p ≠ root(C)


Note that the only way intra_root and intra_non_root differ is that the former has cluster_go
as an input, while the latter has subtree_go instead, and that the root automaton has cluster_ok
as an output, unlike intra_non_root, which has subtree_ok instead. Functionally, however, the two
actions within each pair are identical, differing only in their automata of origin or destination.


Signature


Input:
    ok(p,i)
    subtree_ok^inp(q,p,i), q ∈ children(p,C)
    subtree_go^inp(r,p,i), r = parent(p,C)


Output:
    go(p,i)
    subtree_go^out(p,q,i), q ∈ children(p,C)
    subtree_ok^out(p,r,i), r = parent(p,C)


Internal:
    none


State


    array ok_recd(i)
    array go_sent(i)
    array subtree_ok_recd(q,i), q ∈ children(p,C)
    array subtree_go_sent(q,i), q ∈ children(p,C)
    array subtree_ok_sent(i)
    array subtree_go_recd(i)


Transitions


ok(p,i)
    Effect:
        s.ok_recd(i) = true


subtree_ok^inp(q,p,i)
    Effect:
        s.subtree_ok_recd(q,i) = true


subtree_ok^out(p,r,i)
    Precondition:
        ∀q ∈ children(p,C), s'.subtree_ok_recd(q,i) = true
        s'.ok_recd(i) = true
        s'.subtree_ok_sent(i) = false
    Effect:
        s.subtree_ok_sent(i) = true


subtree_go^inp(r,p,i)
    Effect:
        s.subtree_go_recd(i) = true


go(p,i)
    Precondition:
        s'.subtree_go_recd(i) = true
        s'.go_sent(i) = false
    Effect:
        s.go_sent(i) = true


subtree_go^out(p,q,i)
    Precondition:
        s'.subtree_go_recd(i) = true
        s'.subtree_go_sent(q,i) = false
    Effect:
        s.subtree_go_sent(q,i) = true


Partitions


All go(p,*) actions in one class,
all subtree_ok^out(p,r,*) actions in one class, for r = parent(p,C),
all subtree_go^out(p,q,*) actions in one class, for each q ∈ children(p,C).
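As a rough illustration of the precondition/effect style used in these specifications, the following sketch models one intracluster node automaton for a single round i. It is a simplification under stated assumptions, not the paper's formal automaton: the class name `IntraNode` and the merged field names `up_sent` and `go_recd` are hypothetical, and the root and non-root variants are folded into one class because, as noted above, they differ only in the names of the upward and downward actions.

```python
class IntraNode:
    """One intra_root / intra_non_root automaton, restricted to a single round."""

    def __init__(self, children, is_root):
        self.children = list(children)
        self.is_root = is_root
        self.ok_recd = False
        self.subtree_ok_recd = {q: False for q in self.children}
        self.up_sent = False   # cluster_ok_sent at the root, subtree_ok_sent elsewhere
        self.go_recd = False   # cluster_go_recd at the root, subtree_go_recd elsewhere
        self.go_sent = False
        self.subtree_go_sent = {q: False for q in self.children}

    # Input actions: always enabled, they only have an effect.
    def ok(self):
        self.ok_recd = True

    def subtree_ok_inp(self, q):
        self.subtree_ok_recd[q] = True

    def go_inp(self):
        self.go_recd = True

    # Output actions: return True iff the precondition held and the effect was applied.
    def up_out(self):
        if self.ok_recd and all(self.subtree_ok_recd.values()) and not self.up_sent:
            self.up_sent = True
            return True
        return False

    def go(self):
        if self.go_recd and not self.go_sent:
            self.go_sent = True
            return True
        return False

    def subtree_go_out(self, q):
        if self.go_recd and not self.subtree_go_sent[q]:
            self.subtree_go_sent[q] = True
            return True
        return False

# A two-node cluster: a root with one leaf child named "q" (hypothetical names).
root, leaf = IntraNode(["q"], is_root=True), IntraNode([], is_root=False)
root.ok()
blocked = root.up_out()      # cluster_ok is not yet enabled: the leaf has not reported
leaf.ok()
leaf_reported = leaf.up_out()
root.subtree_ok_inp("q")     # delivery by the tree link is assumed here
cluster_ok = root.up_out()
```

The example shows cluster_ok becoming enabled only after the leaf's subtree_ok has been delivered, and the `up_sent` flag ensures each output fires at most once per round.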

We model the tree edges with I/O automata, and describe a simple automaton that represents 
an asynchronous bidirectional link between a parent and a daughter node that carries messages 
between their intra(_non)_root automata. 


3.4.3 Tree link automata 


Automaton Intra_tree_link(p - q,C), p = parent(q,C)


Signature


Input:
    subtree_go^out(p,q,i)
    subtree_ok^out(q,p,i)
Output:
    subtree_go^inp(p,q,i)
    subtree_ok^inp(q,p,i)
Internal:
    none


State


    buffer, a first-in first-out queue of elements of arbitrary type, initially empty.


Transitions


subtree_go^out(p,q,i)
    Effect:
        Add go(q,i) to buffer


subtree_ok^out(q,p,i)
    Effect:
        Add ok(q,i) to buffer


subtree_go^inp(p,q,i)
    Precondition:
        first(s'.buffer) = go(q,i)
    Effect:
        s.buffer = rest(s'.buffer)


subtree_ok^inp(q,p,i)
    Precondition:
        first(s'.buffer) = ok(q,i)
    Effect:
        s.buffer = rest(s'.buffer)


Partitions


All subtree_go^inp(p,q,*) actions in one class,
all subtree_ok^inp(q,p,*) actions in one class.
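The tree link above is essentially a FIFO queue whose head determines which delivery action is enabled. A minimal sketch, assuming a hypothetical class `TreeLink` and tuple-encoded messages (neither of which is in the paper), could look as follows.

```python
from collections import deque

class TreeLink:
    """FIFO link between a parent p and a child q, following the buffer-based spec."""

    def __init__(self):
        self.buffer = deque()  # initially empty

    def send(self, kind, q, i):
        # Models the input actions subtree_go^out / subtree_ok^out: always enabled.
        self.buffer.append((kind, q, i))

    def enabled(self):
        # The only enabled output is the delivery of the message at the head.
        return self.buffer[0] if self.buffer else None

    def deliver(self):
        # Models subtree_go^inp / subtree_ok^inp: remove and return the head.
        return self.buffer.popleft()

link = TreeLink()
link.send("go", "q", 1)
link.send("ok", "q", 1)
first = link.deliver()
second = link.deliver()
```

Note that, as in the specification, a single buffer carries traffic in both directions, so a message travelling one way is delivered only after everything queued ahead of it, whichever direction that was heading.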

In the next section we justify using the described collection to implement cluster_synch by 
providing a proof of implementational correctness. 


3.5 Proof of implementation 


We will now show that the intra_root automaton, working with the intra_non_root automata and the
links, implements cluster_synch(C). Let TRA(C) be the composition of the intra_root(p,C), p =
root(C), automaton with all the intra_non_root(*,C) automata and the intra_tree_link(* - *,C)
automata. Further, let intra(r,C) denote intra_root(r,C) or intra_non_root(r,C) depending on
whether or not r = p = root(C).

Let hide_Φ(TRA(C)) denote TRA(C) with the actions in Φ hidden, where

Φ = {π ∈ acts(TRA(C)) | π is a subtree_ok^inp, subtree_ok^out, subtree_go^inp, or subtree_go^out
action}


The following theorem guarantees that every behavior of hide_Φ(TRA(C)) is a behavior of cluster_synch(C).

Theorem 3.6 ∀s ∈ states(hide_Φ(TRA(C))) let f be as below.


f(s) = {t ∈ states(cluster_synch(C)) | ∀r ∈ nodes(C),
    t.ok_recd(r,i) = s[intra(r,C)].ok_recd(i)
    t.go_sent(r,i) = s[intra(r,C)].go_sent(i)
    t.cluster_ok_sent(i) = s[intra_root(p,C)].cluster_ok_sent(i)
    t.cluster_go_recd(i) = s[intra_root(p,C)].cluster_go_recd(i) }


Then f is a possibilities mapping from hide_Φ(TRA(C)) to cluster_synch(C).
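One informal way to read the mapping f is as a projection that forgets the bookkeeping of the distributed implementation and keeps only the variables that cluster_synch(C) also has. The sketch below illustrates this for a single round with dictionary-encoded states. It assumes, purely for illustration, that the four listed variables are the whole state of cluster_synch(C), so that f(s) contains exactly one state; the function name `abstract` and the state encoding are hypothetical.

```python
def abstract(s, root):
    """Project an implementation state onto cluster_synch(C)'s variables (one round)."""
    return {
        "ok_recd": {r: node["ok_recd"] for r, node in s.items()},
        "go_sent": {r: node["go_sent"] for r, node in s.items()},
        "cluster_ok_sent": s[root]["cluster_ok_sent"],
        "cluster_go_recd": s[root]["cluster_go_recd"],
    }

# Implementation state before a hidden step: root p and one non-root child r.
s_before = {
    "p": {"ok_recd": True, "go_sent": False,
          "cluster_ok_sent": False, "cluster_go_recd": False,
          "subtree_ok_recd": {"r": False}},
    "r": {"ok_recd": True, "go_sent": False, "subtree_ok_sent": False},
}

# A hidden subtree_ok output at r changes only r's subtree_ok_sent flag.
s_after = {name: dict(node) for name, node in s_before.items()}
s_after["r"]["subtree_ok_sent"] = True

# The image under the projection is unchanged, so the specification takes no step.
unchanged = abstract(s_before, "p") == abstract(s_after, "p")
```

This matches the hidden-action case of a possibilities-mapping proof: a step of the implementation that touches only variables outside the projection corresponds to the empty sequence of specification actions.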


Proof:
If s ∈ start(TRA(C)), the existence of the start state in cluster_synch(C) consistent with f is
easily verified. Let s' be a reachable state of TRA(C), t' any state in f(s') and (s', π, s) a step
of TRA(C). We need to produce an extended step (t', γ, t) of cluster_synch(C) such that t ∈ f(s)
and γ|ext(cluster_synch(C)) = π|ext(TRA(C)), where the subtree_ok and subtree_go actions of
TRA(C) are hidden.


• π is a hidden action (π ∈ Φ)
Take γ to be the null sequence of actions. For all steps (s', π, s) such that
π ∈ Φ we have f(s') = f(s), so f is a valid mapping, as t = t' ∈ f(s') = f(s).


• π ∈ inputs(cluster_synch(C))
We take γ = π. Since γ is an input, it follows that (t', γ, t) is a step (t is the state
cluster_synch(C) winds up in after γ). That t ∈ f(s) can be verified by inspection in both
cases (π = ok(p,i) or cluster_go(C,i)).


• π = go(r,i), r ∈ nodes(C)
As before, take γ = π. If r = p, legality of f follows directly from the code implementing this
action in cluster_synch(C) and intra_root(r,C). If not, π is output by intra_non_root(r,C), which
would imply that s'[intra_non_root(r,C)].subtree_go_recd(i) = true and
s'[intra_non_root(r,C)].go_sent(i) = false. From the second equality, we immediately have
t'[cluster_synch(C)].go_sent(r,i) = false. All we have to show is that when cluster_synch(C)
reaches state t' it has already received the cluster_go for the ith round.


Claim 3.1 For any r ∈ nodes(C), r ≠ root(C), let q_j, j = 0, 1, ..., d be the nodes in C such
that q_0 = r, q_d = root(C) and ∀j < d, q_{j+1} = parent(q_j, C).
If TRA(C) is in state s' such that s'[intra_non_root(q_0,C)].subtree_go_recd(i) = true, then


1. ∀j < d, s'[intra_non_root(q_j,C)].subtree_go_recd(i) = true


2. s'[intra_root(q_d,C)].cluster_go_recd(i) = true.

Proof of claim:


1. By induction on j.


2. From Claim 3.1.1, s'[intra_non_root(q_{d−1},C)].subtree_go_recd(i) = true. From the
implementation of the intra_root automaton, the claim follows.


Action γ will thus be enabled in t', and t such that (t', γ, t) is a step of cluster_synch(C) will
be in f(s).


• π = cluster_ok(C,i)
Take γ = π. If intra_root receives subtree_ok messages from all its children, it needs to
be shown that at the intra_non_root automata at all descendant nodes, the ok_recd
variable for that round is true. That γ is enabled in cluster_synch(C) will then follow
immediately, and the proof will be done.

Claim 3.2 ∀s' ∈ states(TRA(C)), if s'[intra_root(p,C)].cluster_ok_sent(i) = true then ∀r ∈
nodes(C), s'[intra(r,C)].subtree_ok_sent(i) = true


Proof of claim: For any node r in C, let depth(r) denote its distance from root(C),
along the unique path entirely in C between r and root(C). We prove the claim by induction
on the depth of a node r. For a node r at depth 1, since r is a child of the root node,
s'[intra_root(p,C)].cluster_ok_sent(i) = true implies that s'[intra(r,C)].subtree_ok_sent(i) =
true.

Assume that at any node r' at depth d−1, s'[intra(r',C)].subtree_ok_sent(i) = true.
Consider a node r at depth d, which has q at depth d−1 as its parent. The required
condition follows directly from the implementation of the subtree_ok^out(r,q,i) action.


From the above claim and the preconditions of the subtree_ok^out action we have, for all nodes
q, s'[intra(q,C)].ok_recd(i) = true, as required.

□

We are thus guaranteed that behs(hide_Φ(TRA(C))) ⊆ behs(cluster_synch(C)), where Φ is the set
of subtree_ok and subtree_go actions hidden above. But behaviors of hide_Φ(TRA(C)) are clearly
projections of behaviors of TRA(C) onto cluster_synch(C). Hence, TRA(C)
implements cluster_synch(C).


3.6 Fairness of Implementation 


We now argue that TRA is a fairness-preserving implementation. Automaton cluster_synch(C), for
some subtree C in forest, has cluster_ok(C,*) and go(p,*) for each p ∈ nodes(C) as fairness
partitions. As in earlier fairness arguments, it can be shown that unfair behaviors of cluster_synch(C)
can be generated only from unfair behaviors of TRA(C).


Theorem 3.7 Let β be a fair behavior of TRA(C), for some subtree C in forest. Then γ =
β|cluster_synch(C) is a fair behavior of automaton cluster_synch(C).

Proof:

Theorem 3.6 guarantees that γ is a behavior of cluster_synch(C). Assume that γ is an unfair
behavior of cluster_synch(C), and let Γ be the unique execution of that automaton which has γ as
its behavior.

Consider the case when the action go(p,i) is enabled in Γ but never performed, for some i and p
in V. If p is the root node of the subtree C, a contradiction is immediate: the enabling of go(p,i)
implies that cluster_go(C,i) occurs in γ, and hence in β as well. In that event, go(p,i) is enabled
in the execution corresponding to β as well, in the automaton intra_root(p,C). Since β is a fair
behavior, this results in that action being performed, which is a contradiction.

If p is a non-root node of C, at a depth d ≥ 1 from the root node, an inductive argument yields
a contradiction: if go(p,i) does not occur in the fair behavior β of TRA(C), it follows that it is
never enabled during that execution. Automaton intra_non_root(p,C) therefore never receives a
subtree_go message from the automaton intra(parent(p,C),C) at level d−1. This can happen
only if intra(parent(p,C),C) never receives a subtree_go from its parent. Thus if p is such that
go(p,i) is enabled but never performed in Γ, then it follows by induction that the intra_non_root
automaton at depth 1 on the path joining the root to p never receives a subtree_go message. Since
this can happen only if cluster_go(C,i) is not performed, we have a contradiction.


We now consider the case when, for some i, the action cluster_ok(C,i) is enabled, but never
performed in Γ. If cluster_ok is enabled in cluster_synch(C), then it follows that all the front ends
in C have sent in their oks for that round to cluster_synch(C). Thus all these oks appear in β as
well.

If cluster_ok is not performed in the execution corresponding to β (in TRA(C)), it would have
to have been disabled throughout that execution, for otherwise β would not be fair. This implies
that one of the root node's child nodes never sends in its subtree_ok action for that round. By the
same argument, if a node at depth d does not ever receive a subtree_ok from one of its children for
some round, this can happen only if that child node did not receive such an action from one of its
children, at depth d+1.

This inductive argument leads to the fact that at least one leaf node intra_non_root automaton
never sends in a subtree_ok action for that round. But every descending branch of the subtree ends
in a leaf node whose subtree_ok action is enabled if all the oks for that round appear in β, and, if
β is fair, the automaton at this node does perform the subtree_ok action at some point in β. We
thus derive a contradiction to our assumption that fair behaviors of TRA(C) can generate unfair
behaviors of cluster_synch(C). □
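The second half of the fairness argument says, informally, that once every ok for a round has arrived, a scheduler that keeps giving every node a turn must eventually let the root perform cluster_ok. The sketch below illustrates this with a round-robin scheduler, which is one particular fair schedule. It assumes that all oks have already been received and that links deliver instantly, and the function name `steps_until_cluster_ok` is hypothetical.

```python
from collections import deque

def steps_until_cluster_ok(parent):
    """Count scheduler turns until the root reports, under a round-robin schedule."""
    kids = {p: [q for q, par in parent.items() if par == p] for p in parent}
    reported = set()       # nodes whose subtree_ok (cluster_ok at the root) has fired
    turn = deque(parent)   # round-robin order: every node gets infinitely many turns
    steps = 0
    while True:
        p = turn[0]
        turn.rotate(-1)    # move p to the back of the queue
        steps += 1
        # The upward report is enabled once all children have reported.
        if p not in reported and all(q in reported for q in kids[p]):
            reported.add(p)
            if parent[p] is None:
                return steps

# Hypothetical cluster: a is the root, b and c its children, d a child of b.
tree = {"a": None, "b": "a", "c": "a", "d": "b"}
turns = steps_until_cluster_ok(tree)
```

Each full pass over the nodes lets at least one new node report (initially the leaves), so the loop ends after at most n passes for a tree with n nodes. An unfair schedule that never gives some leaf a turn would block the root forever, which is the situation the proof rules out.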

We saw in an earlier section that loc_synch can be implemented as a combination of an intra- 
cluster synchronizer (as described in this section) and an intercluster synchronizer that synchronizes 
between clusters. We now direct our attention to the latter automaton, and put together a collection 
of automata that will implement the intercluster synchronizer forest_synch fairly. 


3.7 Intercluster Synchronization 


We will describe a collection of automata that will implement forest_synch. This automaton's
function resembles that of loc_synch rather closely: it detects the condition that a cluster and all
its neighboring clusters are "safe" (have sent in their cluster_oks) and communicates this fact to
the intracluster synchronizer for that cluster.

Figure 5: The root-node component automaton of the intercluster synchronizer.

This requires each intracluster synchronizer to share actions with a component of forest_synch's
distributed implementation. In our (and Awerbuch's) implementation, this constraint is met by having a key
component automaton (that we shall call "inter_root") of forest_synch's implementation run on
each root node of a cluster (which has the intra_root automaton that generates the cluster_ok
action, and receives the cluster_go action).

Communication between any two neighboring clusters in the distributed implementation occurs
through a predetermined set of edges that form a path between the root nodes of the two clusters.
All but one of the path's edges are chosen from the spanning trees of either cluster. We call
the non-tree edge the preferred link between the two clusters, and denote the set of all such
cluster-bridging edges by preferred_links(forest).

We now describe the operation of the distributed implementation. 

The distributed implementation of forest_synch consists of the following in each cluster: an
inter_root automaton at the root of the cluster, and an inter_non_root automaton at each non-root
node of the cluster.

The functions of the inter_root automaton are twofold.


• Upon being told by cluster_synch that the cluster is safe, it will broadcast this down to the
leaves, through the subtrees, so that neighboring clusters can get this information via bridging
preferred links.


• Messages from neighboring clusters telling the cluster in question that they are safe are
convergecast to the inter_root automaton. After checking to see if its own cluster is safe, it
will then give the cluster_synch automaton (also at the root node) a cluster_go.


Figure 5 summarizes the inter_root automaton's functioning pictorially.


The inter_non_root automaton's functioning resembles that of inter_root closely. Upon receiving
a relay_cluster_ok passed down to it along the path joining it to inter_root, inter_non_root passes
it along to its children, and along any preferred links that may be incident upon it. This behaviour
corresponds to that exhibited by the root automaton when it receives a cluster_ok from cluster_synch.
Another important task performed by inter_non_root is to help convergecast readiness to the root.
An inter_non_root automaton is ready if


• all its children are ready, and
• it has received relay_cluster_oks along all preferred edges incident upon it.


This is merely the condition that the cluster itself, and all clusters neighboring the node or any of
its descendant nodes, are known to be safe. A pictorial representation of the inter_non_root
automaton is given in Figure 6.

Figure 6: The non-root node component automaton of the intercluster synchronizer.

Working together, these automata implement forest_synch. As before, we give the formal 
specifications of the automata described above, prove the implementation correct, and conclude 
with a few remarks arguing that all fair executions of the implementation generate fair executions 
of the implemented system. 


3.7.1 Root node automata in the intercluster synchronizer 


Automaton inter_root(p,C), p = root(C), where C is a subtree of forest.


We describe the component automaton of forest_synch that runs on the root node of every subtree
C of forest.


Signature


Input:
    cluster_ok(C,i)
    relay_cluster_ok^inp(q,p,i), (p,q) ∈ preferred_links(forest)
    ready^inp(q,i), q ∈ children(p)
Output:
    cluster_go(C,i)
    relay_cluster_ok^out(p,q,i), q ∈ children(p) or (p,q) ∈ preferred_links(forest)
Internal:
    none


State


    array cluster_ok_recd(i)
    array relay_ok_recd(q,i), (p,q) ∈ preferred_links(forest)
    array ready_recd(q,i), q ∈ children(p)
    array cluster_go_sent(i)
    array relay_ok_sent(q,i), (p,q) ∈ preferred_links(forest)


Transitions


cluster_ok(C,i)
    Effect:
        s.cluster_ok_recd(i) = true


relay_cluster_ok^inp(q,p,i)
    Effect:
        s.relay_ok_recd(q,i) = true


ready^inp(q,i)
    Effect:
        s.ready_recd(q,i) = true


cluster_go(C,i)
    Precondition:
        ∀q such that (p,q) ∈ preferred_links(forest), s'.relay_ok_recd(q,i) = true
        ∀q ∈ children(p), s'.ready_recd(q,i) = true
        s'.cluster_ok_recd(i) = true
        s'.cluster_go_sent(i) = false
    Effect:
        s.cluster_go_sent(i) = true


relay_cluster_ok^out(p,q,i)
    Precondition:
        s'.cluster_ok_recd(i) = true
        s'.relay_ok_sent(q,i) = false
    Effect:
        s.relay_ok_sent(q,i) = true


Partitions


All cluster_go(C,*) actions in one class,
all relay_cluster_ok^out(p,q,*) actions in one class, for all p and q such that (p,q) ∈ preferred_links(forest).

We now describe, in formal terms, the component automaton on each non-root node in a subtree
C of forest.


3.7.2 Non-root node automata in the intercluster synchronizer 


Automaton inter_non_root(p,C), p ∈ nodes(C), p ≠ root(C) and C is a subtree of forest


Signature


Input:
    relay_cluster_ok^inp(r,p,i), r = parent(p) or (p,r) ∈ preferred_links(forest)
    ready^inp(q,i), q ∈ children(p)
Output:
    ready^out(r,i), r = parent(p)
    relay_cluster_ok^out(p,q,i), q ∈ children(p) or (p,q) ∈ preferred_links(forest)
Internal:
    none


State


    array cluster_ok_recd(i)
    array relay_ok_recd(q,i), (p,q) ∈ preferred_links(forest)
    array ready_recd(q,i), q ∈ children(p)
    array ready_sent(i)
    array relay_ok_sent(q,i), (p,q) ∈ preferred_links(forest)


Transitions


relay_cluster_ok^inp(r,p,i)
    Effect:
        if r = parent(p) then s.cluster_ok_recd(i) = true
        else s.relay_ok_recd(r,i) = true


ready^inp(q,i)
    Effect:
        s.ready_recd(q,i) = true


ready^out(r,i), r = parent(p)
    Precondition:
        ∀q such that (p,q) ∈ preferred_links(forest), s'.relay_ok_recd(q,i) = true
        ∀q ∈ children(p), s'.ready_recd(q,i) = true
        s'.cluster_ok_recd(i) = true
        s'.ready_sent(i) = false
    Effect:
        s.ready_sent(i) = true


relay_cluster_ok^out(p,q,i)
    Precondition:
        s'.cluster_ok_recd(i) = true
        s'.relay_ok_sent(q,i) = false
    Effect:
        s.relay_ok_sent(q,i) = true


Partitions


All ready^out(r,*) in one class, r = parent(p),
all relay_cluster_ok^out(p,q,*) actions in one class, for all p and q such that (p,q) ∈ preferred_links(forest).
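The preconditions of ready and cluster_go have the same shape: the node knows its own cluster is safe, every preferred link incident on it has delivered a relay_cluster_ok, and every child is ready. The following sketch evaluates that condition recursively over a cluster tree for one round. It deliberately abstracts away the link automata by treating a child's readiness as known to its parent immediately; the function `node_ready` and the data layout are hypothetical.

```python
def node_ready(p, children, preferred, relay_ok_recd, cluster_ok_recd):
    """True iff the ready / cluster_go precondition would hold at node p."""
    return (
        # p has learned that its own cluster is safe
        p in cluster_ok_recd
        # a relay_cluster_ok has arrived over every preferred link at p
        and all((p, r) in relay_ok_recd for r in preferred.get(p, ()))
        # every child has (recursively) become ready
        and all(
            node_ready(q, children, preferred, relay_ok_recd, cluster_ok_recd)
            for q in children.get(p, ())
        )
    )

# Hypothetical cluster rooted at a, with a preferred link from c to node x
# in a neighboring cluster.
children = {"a": ["b", "c"]}
preferred = {"c": ["x"]}
informed = {"a", "b", "c"}  # nodes that have received the cluster's own ok

before = node_ready("a", children, preferred, set(), informed)
after = node_ready("a", children, preferred, {("c", "x")}, informed)
```

Here `before` is false because the neighboring cluster's ok has not yet crossed the preferred link at c, and `after` is true once it has, at which point the root's cluster_go would become enabled.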

As before, we model asynchronous communication channels by I/O automata, and provide,
in the next two subsections, formal specifications of link automata which represent tree edges
(inter_tree_link automata) and those that represent preferred links (preferred_link automata).


3.7.3 Tree link automata 


Automaton Inter_tree_link(p - q,C), p = parent(q,C).
This automaton allows inter_non_root(p,C) or inter_root(p,C) and inter_non_root(q,C) to
exchange messages.


Signature


Input:
    ready^out(q,i)
    relay_cluster_ok^out(p,q,i)
Output:
    ready^inp(q,i)
    relay_cluster_ok^inp(p,q,i)
Internal:
    none


State


    buffer, a FIFO queue of elements of any type


Transitions


ready^out(q,i)
    Effect:
        Add ready(i) to buffer

relay_cluster_ok^out(p,q,i)
    Effect:
        Add ok(i) to buffer

ready^inp(q,i)
    Precondition:
        first(s'.buffer) = ready(i)
    Effect:
        s.buffer = rest(s'.buffer)

relay_cluster_ok^inp(p,q,i)
    Precondition:
        first(s'.buffer) = ok(i)
    Effect:
        s.buffer = rest(s'.buffer)


Partitions


All ready^inp(q,*) actions in one class,
all relay_cluster_ok^inp(p,q,*) actions in one class.


3.7.4 Preferred link intercluster automata 


Automaton Preferred_link(p - r, C_p, C_r), p ∈ C_p, r ∈ C_r, C_p and C_r distinct trees in forest.
As mentioned elsewhere, preferred links assist in intercluster communication: they form the
communication channels through which each cluster conveys its "okness" to neighboring clusters.


Signature


Input:
    relay_cluster_ok^out(p,r,i)
    relay_cluster_ok^out(r,p,i)
Output:
    relay_cluster_ok^inp(p,r,i)
    relay_cluster_ok^inp(r,p,i)
Internal:
    none


State


    buffer, a first-in first-out queue of elements of arbitrary type, initially empty.


Transitions


relay_cluster_ok^out(p,r,i)
    Effect:
        Add ok(p,r,i) to buffer


relay_cluster_ok^out(r,p,i)
    Effect:
        Add ok(r,p,i) to buffer


relay_cluster_ok^inp(p,r,i)
    Precondition:
        first(s'.buffer) = ok(p,r,i)
    Effect:
        s.buffer = rest(s'.buffer)


relay_cluster_ok^inp(r,p,i)
    Precondition:
        first(s'.buffer) = ok(r,p,i)
    Effect:
        s.buffer = rest(s'.buffer)


Partitions


All relay_cluster_ok^inp(p,r,*) actions in one class,
all relay_cluster_ok^inp(r,p,*) actions in one class.

Having thus described the distributed implementation of forest_synch, we have now to prove 
correctness, as before. We again use a possibilities mapping proof to show behavioral inclusion. 


3.8 Proof of implementation 


Let TER denote the composition of all the inter_root automata, all the inter_non_root automata
and all the link automata. Take Φ = {π ∈ acts(TER) | π is an action that is not of the form
cluster_go(*,*) or cluster_ok(*,*)} and let hide_Φ(TER) denote TER with the actions in Φ hidden.
Further, let TER(C,p), p ∈ nodes(C), denote the inter_(non_)root automaton at the node p in C.


Theorem 3.8 ∀s ∈ states(hide_Φ(TER)), let


f(s) = {t ∈ states(forest_synch(forest)) | ∀C, a subtree in forest,
    t.cluster_ok_recd(C,i) = s[TER(C,root(C))].cluster_ok_recd(i)
    t.cluster_go_sent(C,i) = s[TER(C,root(C))].cluster_go_sent(i) }


Then f is a possibilities mapping from hide_Φ(TER) to forest_synch.

Proof: If s ∈ start(TER), the existence of the start state in forest_synch consistent with f
is easily verified. Let s' be a reachable state of TER, t' any state in f(s') and (s', π, s) a step
of TER. We need to produce an extended step (t', γ, t) of forest_synch such that t ∈ f(s) and
γ|ext(forest_synch) = π|ext(hide_Φ(TER)). We distinguish two cases depending on π, and take γ to be π
in both cases.

If π = cluster_ok(C,i) for any cluster C, the proof of correctness of f is immediate, as π is
an input action with the right effects in both automata. Otherwise, π = cluster_go(C,i). In this
case the only non-trivial argument we have to make is to show that if cluster_go is enabled in TER
then the intra_root automata in all the neighboring clusters have performed the cluster_ok action
for that round.

Consider any neighboring cluster C' such that the preferred link between C and C' is not
incident upon either of their root nodes. If C' is not safe, we show that one of root(C)'s children
will not be ready; to be precise, the child node on the path containing the preferred link between
the two clusters will not have sent a ready action yet.

We know there exists a sequence of nodes u_0, u_1, ..., u_n, v_{n'}, v_{n'−1}, ..., v_1, v_0 such that the
u_j's form a path from u_0 = root(C) to a leaf of C, the v_j's form a similar path in C' from
v_0 = root(C'), and the edge (u_n, v_{n'}) between the two leaves is a preferred link in G.

If, in state s', s'[TER(C',v_0)].cluster_ok_recd(i) ≠ true, then by the implementation of the
inter_non_root automata at the descendant nodes, s'[TER(C',v_1)].cluster_ok_recd(i) ≠ true. With
this as the base case, an inductive argument shows that for all v_j, 0 ≤ j ≤ n',
s'[TER(C',v_j)].cluster_ok_recd(i) ≠ true. We exploit this fact to complete the proof.

If s'[TER(C',v_{n'})].cluster_ok_recd(i) = false, then the automaton on the other side of the
preferred link (across from TER(C',v_{n'})) could not have received a relay_cluster_ok from the
inter_(non_)root automaton at v_{n'}, that is, s'[TER(C,u_n)].relay_ok_recd(v_{n'},i) ≠ true.
Thus s'[TER(C,u_n)].ready_sent(i) is false, as is s'[TER(C,u_{n−1})].ready_recd(u_n,i). By induction
on smaller values of j, for all j ≥ 0, s'[TER(C,u_j)].ready_recd(u_{j+1},i) = false. Hence we
see that if s'[TER(C,root(C))].ready_recd(q,i) is true for all q ∈ children(root(C)), then all the
neighboring clusters considered above will have sent their cluster_oks for that round already.

In the case when the preferred link between C and C' does touch either of their roots, the above
proof lends itself to an easy modification, and we can conclude that if all the expected
relay_cluster_ok and ready messages have been received, then none of the neighboring clusters can be
unsafe.

The correctness of f is thus guaranteed, and hence behs(hide_Φ(TER)) ⊆ behs(forest_synch),
where Φ is the set of hidden actions defined above.


From the preceding theorem, we can conclude that TER implements forest_synch correctly,
as both TER and hide_Φ(TER) project identically on the automaton forest_synch. We turn our attention
to fairness issues next.


3.9 Fairness of Implementation 


The composition automaton TER has the following property: Fair executions of TER (or equiv- 
alently, executions that are fair to each of its component automata) generate fair behaviors of 
forest_synch. We outline below the argument supporting this claim. 
Automaton forest_synch’s only locally controlled class of actions consists of actions of the form
cluster_go(C, *), where C is a subtree of forest.
As before, we need to show that the generated executions of forest_synch are fair to each of these 
classes of actions. It will be shown that unfair behaviors can be generated only from executions 
of the implementation that are unfair to the automata that generate the cluster_gos. We are thus
guaranteed that fair behaviors of TER will generate only fair behaviors of forest_synch.


Theorem 3.9 Let β be a fair behavior of TER. Then γ = β|forest_synch is a fair behavior of
that automaton.


Proof: Theorem 3.8 guarantees that γ is a behavior of forest_synch. Assume that γ is an unfair
behavior of forest_synch, and let Γ be the unique execution of that automaton which has γ as
its behavior. Then there is a subtree C, and a smallest i such that the action cluster_go(C, i) is
enabled in Γ but never performed in either Γ or β.

If cluster_go(C, i) is enabled in Γ, then, for all subtrees C’ such that C’ is a neighboring cluster
of C or C itself, the action cluster_ok(C’, i) must occur in Γ. Using the fact that β is fair, an
inductive argument will show that the occurrence (in β) of each cluster_ok(C’, i) at a neighboring
cluster will cause a relay_cluster_ok(p, q, i) action to be performed, for some p in nodes(C’) and
q in nodes(C) (the induction is on the distance of each node from root(C’) on the unique path
between root(C’) and q, the node in C across a preferred-link from a node in C’).

We are thus guaranteed that if β is fair, but does not contain cluster_go(C, i), it is because some
inter_non_root automaton at one of root(C)’s child nodes has not performed a ready™* action. A
second inductive argument can now be used to show that there exists a sequence of nodes in C,
v_0, v_1, ..., v_d, such that v_0 = root(C), v_{j-1} = parent(v_j) for j = 1, ..., d, and v_d is a leaf node of C,
with the property that no ready™‘(v_j, i) occurs in β, for 1 ≤ j ≤ d.

But ready™*(v_d, i) is enabled in β: the relay_cluster_oks occur in β if there are incident
preferred-links onto v_d, and we already know that all the relevant cluster_oks occur in β, thus
enabling that action. If β is fair, this action will then occur in that sequence, thus yielding a
contradiction.
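
The role of fairness in this proof can be seen in a toy scheduler. The sketch below is an illustration under simplifying assumptions, not the formal I/O automaton fairness condition: it assumes that every relay_cluster_ok has already arrived, it treats a round-robin schedule as one example of a fair schedule, and the names `run`, `children_of`, and `non_root` are hypothetical. Each non-root node on a path v0 -> v1 -> v2 -> v3 may perform its ready action once all of its children have performed theirs.

```python
def run(schedule, children_of, steps):
    """Give one turn per step to the node chosen by `schedule`.

    A node's ready action is enabled when the node has not yet performed it
    and all of its children have. Returns the set of nodes that performed ready.
    """
    sent = set()
    for t in range(steps):
        node = schedule(t)
        enabled = node not in sent and all(c in sent for c in children_of[node])
        if enabled:
            sent.add(node)
    return sent


# Non-root nodes of the path v0 -> v1 -> v2 -> v3; v3 is the leaf.
children_of = {"v1": ["v2"], "v2": ["v3"], "v3": []}
non_root = ["v1", "v2", "v3"]

# Round-robin gives every node a turn infinitely often: ready climbs from the leaf.
fair = run(lambda t: non_root[t % 3], children_of, steps=9)

# This schedule never gives the leaf v3 a turn, although its action stays enabled.
unfair = run(lambda t: non_root[t % 2], children_of, steps=1000)

print(sorted(fair), sorted(unfair))
```

Under the round-robin schedule the leaf's enabled action is eventually taken and the readys propagate toward the root. Progress stalls only when the schedule starves the leaf, which is exactly the kind of unfair execution the proof rules out.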


We thus have a node-level implementation of a collection of automata, whose composition 
synchronizes clients in a provably correct way. 

We end this report in the next section with a summary of our efforts and some noteworthy 
features of this proof of correctness. 


4 Summary 


In this paper we have offered a formal, rigorous proof of the correctness of Awerbuch’s algorithm 
for network synchronization. We specified both the algorithm and the correctness condition using 
the I/O automaton model. Since the algorithm uses simpler algorithms for synchronization within 
and between ‘clusters’ of nodes, our proof could have imported as lemmas the correctness of these
simpler algorithms, had their correctness been proved elsewhere. Alternatively, the understanding 
of the modularity that the proof gives us would allow us to see how to safely change the choices of 
implementation of the separate parts of the synchronizer, independently of one another. Also, we 
clearly benefit from having carried out the correctness proof in the I/O automaton model which 
supports modularity, since the network synchronizer is often used as an ‘off-the-shelf building block’ 
component in a larger system, and proofs of the correctness of the system will be able to use our 
proof without change. 